INSECTICIDAL PROTEIN TOXINS FROM PHOTORHABDUS

Cross-reference to Related Application 
This patent application is a continuation-in-part of U.S. 
Patent Application Serial Number 08/743,699 filed on 
November 6, 1996, which is a continuation-in-part of U.S. Patent 
Application Serial Number 08/705,484 filed on August 28, 1996, 
which is a continuation-in-part of U.S. Patent Application Serial 
Number 08/608,423 filed February 28, 1996, which is a continuation- 
in-part of U.S. Patent Application Serial Number 08/395,947 filed 
February 28, 1995, which was a continuation-in-part of U.S. Patent 
Application Serial Number 08/063,615 filed May 18, 1993. This 
application is also a continuation-in-part of provisional U.S. 
Patent Application Serial Number 60/007,255 filed November 6, 1995. 

Field of the Invention 

The present invention relates to toxins isolated from bacteria 
and the use of said toxins as insecticides. 

Background of the Invention

Many insects are widely regarded as pests to homeowners, to 
picnickers, to gardeners, and to farmers and others whose 
investments in agricultural products are often destroyed or 
diminished as a result of insect damage to field crops. 
Particularly in areas where the growing season is short,
significant insect damage can mean the loss of all profits to
growers and a dramatic decrease in crop yield. Scarce supply of
particular agricultural products invariably results in higher costs
to food processors and, then, to the ultimate consumers of food
plants and products derived from those plants.

Preventing insect damage to crops and flowers and eliminating
the nuisance of insect pests have typically relied on strong
organic pesticides and insecticides with broad toxicities. These
synthetic products have come under attack by the general population
as being too harsh on the environment and on those exposed to such
agents. Similarly in non-agricultural settings, homeowners would
be satisfied to have insects avoid their homes or outdoor meals
without needing to kill the insects.

The extensive use of chemical insecticides has raised
environmental and health concerns for farmers, companies that ...
groups, and the public in general. The development of less
intrusive pest management strategies has been spurred along both by
societal concern for the environment and by the development of
biological tools which exploit mechanisms of insect management.
Biological control agents present a promising alternative to
chemical insecticides.

Organisms at every evolutionary development level have devised
means to enhance their own success and survival. The use of
biological molecules as tools of defense and aggression is known
throughout the animal and plant kingdoms. In addition, the
relatively new tools of the genetic engineer allow modifications to
biological insecticides to accomplish particular solutions to
particular problems.

One such agent, Bacillus thuringiensis (Bt), is an effective
insecticidal agent, and is widely commercially used as such. In
fact, the insecticidal agent of the Bt bacterium is a protein which
has such limited toxicity that it can be used on human food crops on
the day of harvest. To non-targeted organisms, the Bt toxin is a
digestible non-toxic protein.

Another known class of biological insect control agents is
certain genera of nematodes known to be vectors of transmission for
insect-killing bacterial symbionts. Nematodes containing
insecticidal bacteria invade insect larvae. The bacteria then kill
the larvae. The nematodes reproduce in the larval cadaver. The
nematode progeny then eat the cadaver from within. The
bacteria-containing nematode progeny thus produced can then invade
additional larvae.



In the past, insecticidal nematodes in the Steinernema and
Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of
bacterium. In nematodes of the Heterorhabditis genus, the
symbiotic bacterium is Photorhabdus luminescens.

Although these nematodes are effective insect control agents,
it is presently difficult, expensive, and inefficient to produce,
maintain, and distribute nematodes for insect control.

It has been known in the art that one may isolate an
insecticidal toxin from Photorhabdus luminescens that has activity
only when injected into Lepidopteran and Coleopteran insect larvae.
This has made it impossible to effectively exploit the insecticidal
properties of the nematode or its bacterial symbiont. What would
be useful would be a more practical, less labor-intensive wide-area
delivery method of an insecticidal toxin which would retain its
biological properties after delivery. It would be quite desirable
to discover toxins with oral activity produced by the genus
Photorhabdus. The isolation and use of these toxins are desirable
for reasons of efficacy. Until applicants' discoveries, these
toxins had not been isolated or characterized.

WO 98/08932 PCT/US97/07657

Summary of the Invention

The native toxins are protein complexes that are produced and
secreted by growing bacterial cells of the genus Photorhabdus. Of
interest are the proteins produced by the species Photorhabdus
luminescens. The protein complexes, with a molecular size of
approximately 1,000 kDa, can be separated by SDS-PAGE gel analysis
into numerous component proteins. The toxins contain no hemolysin,
lipase, type C phospholipase, or nuclease activities. The toxins
exhibit significant toxicity upon exposure to a number of insects.

The present invention provides an easily administered
insecticidal protein as well as the expression of toxin in a
heterologous system.

The present invention also provides a method for delivering
insecticidal toxins that are functionally active and effective
against many orders of insects.

Objects, advantages, and features of the present invention
will become apparent from the following specification.



Brief Description of the Drawings 



Fig. 1 is an illustration of a match of cloned DNA isolates
used in sequencing the genes for the toxin of the present
invention.

Fig. 2 is a map of three plasmids used in the sequencing
process.

Fig. 3 is a map illustrating the inter-relationship of several
partial DNA fragments.

Fig. 4 is an illustration of a homology analysis between the 
protein sequences of TcbAii and TcaBii proteins. 

Fig. 5 is a phenogram of Photorhabdus strains. Relationship
of Photorhabdus strains was defined by rep-PCR. The upper axis of
Fig. 5 measures the percentage similarity of strains based on
scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0
[100% similarity]). At the right axis, the numbers and letters
indicate the various strains tested: 14=W-14, Hm=Hm, H9=H9,
7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4, 9=WX-9,
8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5, 6=WX-6,
12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through 52=ATCC
43948 through ATCC 43952. Vertical lines separating horizontal
lines indicate the degree of relatedness (as read from the
extrapolated intersection of the vertical line with the upper axis)
between strains or groups of strains at the base of the horizontal
lines (e.g., strain W-14 is approximately 60% similar to strains H9
and Hm).

Fig. 6 is an illustration of the genomic maps of the W-14
strain.

Fig. 6A is an illustration of the tca and tcb loci and primary
gene products.

Fig. 7 is a phenogram of Photorhabdus strains as defined by
rep-PCR. The upper axis of Fig. 7 measures the percentage
similarity of strains based on scoring of rep-PCR products (i.e.,
0.0 [no similarity] to 1.0 [100% similarity]). At the right axis,
the numbers and letters indicate the various strains tested.
Vertical lines separating horizontal lines indicate the degree of
relatedness (as read from the extrapolated intersection of the
vertical line with the upper axis) between strains or groups of
strains at the base of the horizontal lines (e.g., strain Indicus
is approximately 30% similar to strains MP1 and HB Oswego). Note
that the Photorhabdus strains on the phenogram are as follows: 14
= W-14; Hm = Hm; H9 = H9; 7 = WX-7; 1 = WX-1; 2 = WX-2; 88 = HP88;
NC1 = NC-1; 4 = WX-4; 9 = WX-9; 8 = WX-8; 10 = WX-10; 30 = W30; WIR
= WIR; 3 = WX-3; 11 = WX-11; 5 = WX-5; 6 = WX-6; 12 = WX-12; 15 =
WX-15; X14 = WX-14; Hb = Hb; B2 = B2; 48 = ATCC 43948; 49 = ATCC
43949; 50 = ATCC 43950; 51 = ATCC 43951; 52 = ATCC 43952.



Detailed Description of the Invention



The present inventions are directed to the discovery of a
unique class of insecticidal protein toxins from the genus
Photorhabdus that have oral toxicity against insects. A unique
feature of Photorhabdus is its bioluminescence. Photorhabdus may
be isolated from a variety of sources. One such source is
nematodes, more particularly nematodes of the genus
Heterorhabditis. Another such source is from human clinical
samples from wounds, see Farmer et al. 1989 J. Clin. Microbiol. 27
... saprophytic strains are deposited in the
American Type Culture Collection (Rockville, MD) ATCC #s 43948,
43949, 43950, 43951, and 43952, and are incorporated herein by
reference. It is possible that other sources could harbor
Photorhabdus bacteria that produce insecticidal toxins. Such
sources in the environment could be either terrestrial or aquatic
based.

The genus Photorhabdus is taxonomically defined as a member of
the Family Enterobacteriaceae, although it has certain traits
atypical of this family. For example, strains of this genus are
nitrate reduction negative, yellow and red pigment producing and
bioluminescent. This latter trait is otherwise unknown within the
Enterobacteriaceae. Photorhabdus has only recently been described
as a genus separate from Xenorhabdus (Boemare et al., 1993 Int.
J. Syst. Bacteriol. 43, 249-255). This differentiation is based on
DNA-DNA hybridization studies, phenotypic differences (e.g.,
presence (Photorhabdus) or absence (Xenorhabdus) of catalase and
bioluminescence) and the Family of the nematode host (Xenorhabdus:
Steinernematidae; Photorhabdus: Heterorhabditidae). Comparative
cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl.
Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl.
Microbiol., 36, 393-401) support the separation of Photorhabdus
from Xenorhabdus.

In order to establish that the strain collection disclosed
herein was comprised of Photorhabdus strains, the strains were
characterized based on recognized traits which define Photorhabdus
and differentiate it from other Enterobacteriaceae and Xenorhabdus
species (Farmer, 1984 Bergey's Manual of Systematic Bacteriology
Vol. 1 pp. 510-511; Akhurst and Boemare 1988, J. Gen. Microbiol. 134
pp. 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43
pp. 249-255, which are incorporated herein by reference). The
traits studied were the following: gram stain negative rods,
organism size, colony pigmentation, inclusion bodies, presence of
catalase, ability to reduce nitrate, bioluminescence, dye uptake,
gelatin hydrolysis, growth on selective media, growth temperature,
survival under anaerobic conditions and motility. Fatty acid
analysis was used to confirm that the strains herein all belong to
the single genus Photorhabdus.

Currently, the bacterial genus Photorhabdus is comprised of a
single defined species, Photorhabdus luminescens (ATCC Type strain
#29999, Poinar et al., 1977, Nematologica 23, 97-102). A variety
of related strains have been described in the literature (e.g.,
Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845; Boemare
et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz et al.
1990, Appl. Environ. Microbiol., 56, 181-186). Numerous
Photorhabdus strains have been characterized herein. Because there
is currently only one species (luminescens) defined within the
genus Photorhabdus, the luminescens species traits were used to
characterize the strains herein. As can be seen in Fig. 5, these
strains are quite diverse. It is not unforeseen that in the future
there may be other Photorhabdus species that will have some of the
attributes of the luminescens species as well as some different
characteristics that are presently not defined as a trait of
Photorhabdus luminescens. However, the scope of the invention
herein extends to any Photorhabdus species or strains which produce
proteins that have functional activity as insect control agents,
regardless of other traits and characteristics.

Furthermore, as is demonstrated herein, the bacteria of the
genus Photorhabdus produce proteins that have functional activity
as defined herein. Of particular interest are proteins produced by
the species Photorhabdus luminescens. The inventions herein should
in no way be limited to the strains which are disclosed herein.
These strains illustrate for the first time that proteins produced
by diverse isolates of Photorhabdus are toxic upon exposure to
insects. Thus, included within the inventions described herein are
the strains specified herein and any mutants thereof, as well as
any strains or species of the genus Photorhabdus that have the
functional activity described herein.

There are several terms that are used herein that have a 
particular meaning and are as follows: 

By "functional activity" it is meant herein that the protein
toxin(s) function as insect control agents in that the proteins are
orally active, or have a toxic effect, or are able to disrupt or
deter feeding, which may or may not cause death of the insect.
When an insect comes into contact with an effective amount of toxin
delivered via transgenic plant expression, formulated protein
composition(s), sprayable protein composition(s), a bait matrix or
other delivery system, the results are typically death of the
insect, or the insects do not feed upon the source which makes the
toxins available to the insects.

By the use of the term "genetic material" herein, it is meant to
include all genes, nucleic acid, DNA and RNA.

By "homolog" it is meant an amino acid sequence that is identified
as possessing homology to a reference W-14 toxin polypeptide amino
acid sequence.

By "homology" it is meant an amino acid sequence that has a
similarity index of at least 33% and/or an identity index of at
least 26% to a reference W-14 toxin polypeptide amino acid sequence,
as scored by the GAP algorithm using the BLOSUM62 protein scoring
matrix (Wisconsin Package Version 9.0, Genetics Computer Group
(GCG), Madison, WI).

By "identity" is meant an amino acid sequence that contains an
identical residue at a given position, following alignment with a
reference W-14 toxin polypeptide amino acid sequence by the GAP
algorithm.
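
The homology and identity thresholds defined above can be illustrated
with a short sketch. It is not the GCG GAP program: it assumes the two
sequences have already been aligned (gaps written as "-"), it counts
only columns without gaps, and it approximates "similar" residues with
a hypothetical set of conservative groups rather than the BLOSUM62
matrix. The names alignment_indices and is_homolog are illustrative.

```python
# Sketch only. Assumptions: the caller supplies a finished pairwise
# alignment, and "similar" residues are approximated by the conservative
# groups below instead of BLOSUM62 scores as used by GAP.

SIMILAR_GROUPS = ("ILMV", "FWY", "KRH", "DE", "NQ", "ST", "AG")  # assumed


def _similar(a, b):
    """True if residues are identical or share an assumed group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def alignment_indices(ref, query, gap="-"):
    """Return (similarity %, identity %) over aligned, gap-free columns."""
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(ref.upper(), query.upper())
        if a != gap and b != gap
    ]
    if not columns:
        return 0.0, 0.0
    identical = sum(a == b for a, b in columns)
    similar = sum(_similar(a, b) for a, b in columns)
    n = len(columns)
    return 100.0 * similar / n, 100.0 * identical / n


def is_homolog(ref, query, min_similarity=33.0, min_identity=26.0):
    """Apply the 'and/or' rule: either threshold being met is sufficient."""
    similarity, identity = alignment_indices(ref, query)
    return similarity >= min_similarity or identity >= min_identity
```

Because the definition reads "and/or", the sketch treats a sequence as
homologous when either index reaches its threshold; a stricter reading
would require a different test.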

The protein toxins discussed herein are typically referred to as
"insecticides". By insecticides it is meant herein that the
protein toxins have a "functional activity" as further defined
herein and are used as insect control agents.

By the use of the term "oligonucleotides" it is meant a
macromolecule consisting of a short chain of nucleotides of either
RNA or DNA. The length could be at least one nucleotide, but is
typically in the range of about 10 to about 12 nucleotides.
The determination of the length of the oligonucleotide is well
within the skill of an artisan and should not be a limitation
herein. Therefore, oligonucleotides may be less than 10 or greater
than 12.



By the use of the term "Photorhabdus toxin" it is meant any protein
produced by a Photorhabdus microorganism strain which has
functional activity against insects, where the Photorhabdus toxin
could be formulated as a sprayable composition, expressed by a
transgenic plant, formulated as a bait matrix, delivered via
baculovirus, or delivered by any other applicable host or delivery
system.

By the use of the term "toxic" or "toxicity" as used herein it is
meant that the toxins produced by Photorhabdus have "functional
activity" as defined herein.

By "truncated peptide" it is meant herein to include any peptide
that is a fragment of the peptides observed to have functional
activity.

By "substantial sequence homology" is meant either: a DNA fragment
having a nucleotide sequence sufficiently similar to another DNA
fragment to produce a protein having similar biochemical properties;
or a polypeptide having an amino acid sequence sufficiently similar
to another polypeptide to exhibit similar biochemical properties.



Fermentation broths from selected strains reported in Table 20
were used to determine the breadth of insecticidal toxin
production by the Photorhabdus genus and the insecticidal
spectrum of these toxins, and to provide source material to purify
the toxin complexes. The strains characterized herein have been
shown to have oral toxicity against a variety of insect orders.
Such insect orders include but are not limited to Coleoptera,
Homoptera, Lepidoptera, Diptera, Acarina, Hymenoptera and
Dictyoptera.

As with other bacterial toxins, the rate of mutation of the
bacteria in a population causes many related toxins slightly
different in sequence to exist. Toxins of interest here are those
which produce protein complexes toxic to a variety of insects upon
exposure, as described herein. Preferably, the toxins are active
against Lepidoptera, Coleoptera, Homoptera, Diptera, Hymenoptera,
Dictyoptera and Acarina. The inventions herein are intended to
capture the protein toxins homologous to protein toxins produced by
the strains herein and any derivative strains thereof, as well as
any protein toxins produced by Photorhabdus. These homologous
proteins may differ in sequence, but do not differ in function from
those toxins described herein. Homologous toxins are meant to
include protein complexes of between 300 kDa and 2,000 kDa that are
comprised of at least two (2) subunits, where a subunit is a
peptide which may or may not be the same as the other subunit.
Various protein subunits have been identified and are taught in the
Examples herein. Typically, the protein subunits are between about
18 kDa to about 230 kDa; between about 160 kDa to about 230 kDa;
100 kDa to 160 kDa; about 80 kDa to about 100 kDa; and about 50 kDa
to about 80 kDa.
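
The size ranges above can be written down as a small lookup, which may
help when comparing measured masses against them. This is a minimal
sketch under stated assumptions: masses are supplied in kDa by the
caller, range boundaries are treated as inclusive, and the word
"about" is not modeled. The function names are hypothetical.

```python
# Hypothetical helpers; the numeric ranges come from the text, the
# inclusive boundaries and function names are assumptions.

COMPLEX_RANGE_KDA = (300.0, 2000.0)
SUBUNIT_OVERALL_KDA = (18.0, 230.0)

# Narrower subunit size classes listed in the text, largest first.
SUBUNIT_CLASSES_KDA = [
    (160.0, 230.0),
    (100.0, 160.0),
    (80.0, 100.0),
    (50.0, 80.0),
]


def subunit_class(mass_kda):
    """Return the first listed size class containing mass_kda, else None."""
    for low, high in SUBUNIT_CLASSES_KDA:
        if low <= mass_kda <= high:
            return (low, high)
    return None


def fits_typical_size_profile(complex_kda, subunit_masses_kda):
    """Check complex size, subunit count (>= 2), and typical subunit range."""
    low, high = COMPLEX_RANGE_KDA
    sub_low, sub_high = SUBUNIT_OVERALL_KDA
    return (
        low <= complex_kda <= high
        and len(subunit_masses_kda) >= 2
        and all(sub_low <= m <= sub_high for m in subunit_masses_kda)
    )
```

A mass that falls exactly on a shared boundary (for example 160 kDa) is
reported in the larger class only because that class is listed first.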

As discussed above, some Photorhabdus strains can be isolated
from nematodes. Some nematodes, elongated cylindrical parasitic
worms of the phylum Nematoda, have evolved an ability to exploit
insect larvae as a favored growth environment. The insect larvae
provide a source of food for growing nematodes and an environment
in which to reproduce. One dramatic effect that follows invasion
of larvae by certain nematodes is larval death. Larval death
results from the presence of, in certain nematodes, bacteria that
produce an insecticidal toxin which arrests larval growth and
inhibits feeding activity.

Interestingly, it appears that each genus of insect parasitic
nematode hosts a particular species of bacterium, uniquely adapted
for symbiotic growth with that nematode. In the interim since this
research was initiated, the bacterial genus Xenorhabdus was
reclassified into the genera Xenorhabdus and Photorhabdus.
Bacteria of the genus Photorhabdus are characterized as being
symbionts of Heterorhabditis nematodes while Xenorhabdus species
are symbionts of the Steinernema species. This change in
nomenclature is reflected in this specification, but in no way
should a change in nomenclature alter the scope of the inventions
described herein.

The peptides and genes that are disclosed herein are named
according to the guidelines recently published in the Journal of
Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996), which
is incorporated herein by reference. The following peptides and
genes were isolated from Photorhabdus strain W-14.

Table 1. Peptide/Gene Nomenclature, Toxin Complex

Peptide   Peptide                                  Gene   Gene
Name      Sequence ID No.                          Name   Sequence ID No.

tca genomic region
TcaA      34^c                                     tcaA   33
TcaAi     pro-peptide                              tcaA
TcaAii    [15]^a, 34^c                             tcaA
TcaAiii   [4]^a, 35^c                              tcaA
TcaAiv    [62]^a                                   tcaA
TcaB      [3]^a, (19, 20)^b, 26^c                  tcaB   25
TcaBi     [3]^a, (19, 20)^b, 28^c                  tcaB   27
TcaBii    [5]^a, 30^c                              tcaB   29
TcaC      [2]^a, 32^c                              tcaC   31

tcb genomic region
TcbA      12^c, [16]^a, (21, 22, 23, 24)^b         tcbA   11
TcbAi     pro-peptide                              tcbA
TcbAii    [1]^a, (21, 22, 23, 24)^b, 53^c          tcbA   52
TcbAiii   [40]^a, 55^c                             tcbA   54

tcc genomic region
TccA      [8]^a, 57^c                              tccA   56
TccB      [7]^a, 59^c                              tccB   58
TccC      61^c                                     tccC   60

tcd genomic region
TcdA      (17, 18, 37, 38, 39, 42, 43)^b, 47^c     tcdA   (36)^d, 46
TcdAi     pro-peptide                              tcdA
TcdAii    [13]^a, (17, 18, 37, 38, 39)^b, 49^c     tcdA   48
TcdAiii   [41]^a, (42, 43)^b, 51^c                 tcdA   50
TcdB      [14]^a                                   tcdB

^a Sequence ID Nos. in brackets are peptide N-termini;
^b numbers in parentheses are N-termini of internal peptide tryptic
   fragments;
^c deduced from gene sequence;
^d internal gene fragment.

The sequences listed above are grouped by genomic region. More
specifically, the Photorhabdus luminescens bacterium (W-14) has at
least four distinct genomic regions: tca, tcb, tcc and tcd. As can
be seen in Table 1, peptide products are produced from these
distinct genomic regions. Furthermore, as illustrated in the
Examples, specifically Examples 15 and 21, individual gene products
produced from three genomic regions are associated with insect
activity. There is also considerable homology between these four
genomic regions.
As is further illustrated in the Examples, the tcbA gene was
expressed in E. coli as two possible biologically active protein
fragments (TcbA and TcbAii/iii). The tcdA gene was also expressed
in E. coli. As illustrated in Example 16, when the native
unprocessed TcbA toxin was treated with the endogenous
metalloproteases or insect gut contents containing proteases, the
TcbA protein toxin was processed into smaller subunits that were
less than the size of the native peptides and Southern Corn
Rootworm activity increased. The smaller toxin peptides remained
associated as part of a toxin complex. It may be desirable in some
situations to increase activation of the toxin(s) by proteolytic
processing or using truncated peptides. Thus, it may be more
desirable to use truncated peptide(s) in some applications, i.e.,
commercial transgenic plant applications.

In addition to the W-14 strain, there are other species within
the Photorhabdus genus that have functional activity which is
differential (specifically see Tables 20 and 36). Even though there
is differential activity, the amino acid sequences in some cases
have substantial sequence homology. Moreover, the molecular probes
indicate that some genes contained in the strains are homologous to
the genes contained in the W-14 strain. In fact all of the strains
illustrated herein have one or more homologs of W-14 toxin genes.
The antibody data in Example 26 and the N-terminal sequence data in
Example 25 further support the conclusion that there is homology and
identity (based on amino acid sequence) between the protein toxin(s)
produced by these strains. At the molecular level, the W-14 gene
probes indicated that the homologs or the W-14 genes themselves
(Tables 37, 38, and 39) are dispersed throughout the Photorhabdus
genus. Further, it is possible that new toxin genes exist in other
strains which are not homologous to W-14, but maintain overall
protein attributes (see specifically Examples 14 and 25).

Even though there is homology or identity between toxin genes
carried by the Photorhabdus strains, the strains themselves are
quite diverse. Using polymerase chain reaction technology further
discussed in Example 22, most of the strains illustrated herein are
quite distinguishable. For example, as can be seen in Fig. 5, the
relative similarity of some of the strains, such as HP88
and NC-1, was about 0.8, which indicates that the strains are
similar, while that of HP88 and Hb was about 0.1, which indicates
substantial diversity. Therefore, even though the insect toxin
genes or gene products that the strains produce are the same or
similar, the strains themselves are diverse.
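
Similarity values on the 0.0 to 1.0 scale discussed above can be
computed from presence/absence scoring of rep-PCR bands. The sketch
below is one way to do so and rests on assumptions: each strain is
represented by the set of band sizes scored as present, and similarity
is taken as the Jaccard coefficient. The text here does not state
which coefficient was used for Fig. 5, and the band profiles shown are
made up for illustration only.

```python
# Sketch: band-sharing similarity between rep-PCR fingerprints.
# Assumption: similarity = Jaccard coefficient over scored bands; the
# coefficient actually used to build the phenograms is not given here.


def band_similarity(bands_a, bands_b):
    """Jaccard similarity: shared bands / all bands seen in either strain."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_matrix(profiles):
    """Pairwise similarities for a {strain_name: bands} mapping."""
    names = sorted(profiles)
    return {
        (x, y): band_similarity(profiles[x], profiles[y])
        for x in names
        for y in names
    }


# Made-up band sizes in base pairs (not data from this specification).
example_profiles = {
    "strain_A": [300, 450, 600, 900, 1200],
    "strain_B": [300, 450, 600, 900, 1500],
    "strain_C": [250, 700, 1200],
}
```

A clustering step (for example average linkage on 1 minus similarity)
would then be needed to draw a phenogram; that step is omitted here.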
In view of the data further disclosed in the Examples and
discussions herein, it is clear that a new and unique family of
insecticidal protein toxin(s) has been discovered. It has been
further illustrated herein that these toxin(s) widely exist within
bacterial strains of the Photorhabdus genus. It may also be the
case that these toxin genes widely exist within the family
Enterobacteriaceae. Antibodies prepared as described in Example 21
or gene probes prepared as described in Example 25 may be used to
further screen for bacterial strains within the family
Enterobacteriaceae that produce the homologous toxin(s) that have
functional activity. It may also be the case that specific primer
sets exist that could facilitate the identification of new genes
within the Photorhabdus genus or family Enterobacteriaceae.

As stated above, the antibodies may be used to rapidly screen
bacteria of the genus Photorhabdus or the family Enterobacteriaceae
for homologous toxin products as illustrated in Example 26. Those
skilled in the art are quite familiar with the use of antibodies as
an analysis or screening tool (see US Patent No. 5,430,137, which is
incorporated herein by reference). Moreover, it is generally
accepted in the literature that antibodies are elicited against 6 to
20 amino acid residue segments that tend to occupy exposed surface
of polypeptides (Current Protocols in Immunology, Coligan et al.,
National Institutes of Health, John Wiley & Sons, Inc.). Usually
the segments consist of contiguous amino acid residues; however,
in certain cases they may be formed by non-contiguous amino acids
that are constrained by specific conformation. The amino acid
segments recognized by antibodies are highly specific and commonly
referred to as epitopes. The amino acid fragment can be generated by
chemical and/or enzymatic cleavage of the native protein, by
automated, solid-phase peptide synthesis, or by production from
genetically engineered organisms. Polypeptide fragments can be
isolated by a variety and/or combination of HPLC and FPLC
chromatographic methods known in the art. Selection of polypeptide
fragment can be aided by the use of algorithms, for example Kyte and
Doolittle, 1982, Journal of Molecular Biology 157: 105-132 and Chou
and Fasman, 1974, Biochemistry 13: 222-245, that predict those
sequences most likely to be exposed on the surface of the protein. For
preparation of immunogen containing the polypeptide fragment of
interest, in general, polypeptides are covalently coupled using
chemical reactions to carrier proteins such as keyhole limpet
hemocyanin via free amino (lysine), sulfhydryl (cysteine), phenolic
(tyrosine) or carboxylic (aspartate or glutamate) groups. Immunogen
with an adjuvant is injected in animals, such as mice or rabbits, or
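
The use of a hydropathy algorithm to pick candidate surface segments,
as mentioned above, can be sketched as follows. This is a minimal
illustration using the Kyte and Doolittle hydropathy scale with a
sliding window; the window length of 7 and the function names are
assumptions, and low average hydropathy is only a rough proxy for
surface exposure.

```python
# Sketch: sliding-window Kyte-Doolittle hydropathy to find the most
# hydrophilic segment of a peptide. Window size 7 is an assumption.

# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy per window; returns a list of (start, score)."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[r] for r in seq]  # KeyError on non-standard residues
    return [
        (i, sum(scores[i:i + window]) / window)
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_segment(seq, window=7):
    """Return (start, segment, score) for the lowest-scoring window."""
    start, score = min(hydropathy_profile(seq, window), key=lambda p: p[1])
    return start, seq.upper()[start:start + window], score
```

For instance, most_hydrophilic_segment("LLLLLLLKDEKRDELLLLLLL") selects
the charged stretch in the middle of that made-up sequence.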
Complete lethality to feeding insects is useful but is not
required to achieve useful toxicity. If the insects avoid the
toxin or cease feeding, that avoidance will be useful in some
applications, even if the effects are sublethal. For example, if
insect resistant transgenic crop plants are desired, a reluctance
of insects to feed on the plants is as useful as lethal toxicity to
the insects since the ultimate objective is protection of the
plants rather than killing the insect.

There are many other ways in which toxins can be incorporated
into an insect's diet. As an example, it is possible to adulterate
the larval food source with the toxic protein by spraying the food
with a protein solution, as disclosed herein. Alternatively, the
purified protein could be genetically engineered into an otherwise
harmless bacterium, which could then be grown in culture, and
either applied to the food source or allowed to reside in the soil
in an area in which insect eradication was desirable. Also, the
protein could be genetically engineered directly into an insect
food source. For instance, the major food source of many insect
larvae is plant material.

By incorporating genetic material that encodes the
insecticidal properties of the Photorhabdus toxins into the genome
of a plant eaten by a particular insect pest, the adult or larvae
would die after consuming the food plant. Numerous members of the
monocotyledonous and dicotyledonous genera have been transformed.
Transgenic agronomic crops as well as fruits and vegetables are of
commercial interest. Such crops include but are not limited to
maize, rice, soybeans, canola, sunflower, alfalfa, sorghum, wheat,
cotton, peanuts, tomatoes, potatoes, and the like. Several
techniques exist for introducing foreign genetic material into
plant cells, and for obtaining plants that stably maintain and
express the introduced gene. Such techniques include acceleration
of genetic material coated onto microparticles directly into cells
(U.S. Patents 4,945,050 to Cornell and 5,141,131 to DowElanco).
Plants may be transformed using Agrobacterium technology, see U.S.
Patent 5,177,010 to University of Toledo, 5,104,310 to Texas A&M,
European Patent Application 0131624B1, European Patent Applications
120516, 159418B1 and 176,112 to Schilperoot, U.S. Patents
5,149,645, 5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to
Schilperoot, European Patent Applications 116718, 290799, 320500
all to Max Planck, European Patent Applications 604662 and 627752
to Japan Tobacco, European Patent Applications 0267159 and 0292435
and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents 5,463,174
and 4,762,785 both to Calgene, and U.S. Patents 5,004,863 and
5,159,135 both to Agracetus. Other transformation technology
includes whiskers technology, see U.S. Patents 5,302,523 and
5,464,765 both to Zeneca. Electroporation technology has also been
used to transform plants, see WO 87/06614 to Boyce Thompson
Institute, 5,472,669 and 5,384,253 both to Dekalb, WO9209696 and
WO9321335 both to PGS. All of these transformation patents and
publications are incorporated by reference. In addition to
numerous technologies for transforming plants, the type of tissue
which is contacted with the foreign genes may vary as well. Such
tissue would include but would not be limited to embryogenic
tissue, callus tissue type I and II, hypocotyl, meristem, and the
like. Almost all plant tissues may be transformed during
dedifferentiation using appropriate techniques within the skill of
an artisan.

Another variable is the choice of a selectable marker. The
preference for a particular marker is at the discretion of the
artisan, but any of the following selectable markers may be used
along with any other gene not listed herein which could function as
a selectable marker. Such selectable markers include but are not
limited to the aminoglycoside phosphotransferase gene of transposon
Tn5 (Aph II), which encodes resistance to the antibiotics kanamycin,
neomycin and G418, as well as those genes which code for resistance
or tolerance to glyphosate; hygromycin; methotrexate;
phosphinothricin (bialaphos); imidazolinones, sulfonylureas and
triazolopyrimidine herbicides, such as chlorsulfuron; bromoxynil,
dalapon and the like.

In addition to a selectable marker, it may be desirable to use
a reporter gene. In some instances a reporter gene may be used
without a selectable marker. Reporter genes are genes which are
typically not present or expressed in the recipient organism or
tissue. The reporter gene typically encodes a protein which
provides for some phenotypic change or enzymatic property.
Examples of such genes are provided in K. Weising et al., Ann. Rev.
Genetics, 22, 421 (1988), which is incorporated herein by
reference. A preferred reporter gene is the glucuronidase (GUS)
gene.

Regardless of transformation technique, the gene is preferably
incorporated into a gene transfer vector adapted to express the
Photorhabdus toxins in the plant cell by including in the vector a
plant promoter. In addition to plant promoters, promoters from a
variety of sources can be used efficiently in plant cells to
express foreign genes. For example, promoters of bacterial origin,
such as the octopine synthase promoter, the nopaline synthase
promoter, the mannopine synthase promoter; promoters of viral
origin, such as the cauliflower mosaic virus (35S and 19S),
reengineered 35S, known as 35T (see PCT/US96/16582, WO 97/13402
published April 17, 1997, which is incorporated herein by
reference) and the like may be used. Plant promoters include, but
are not limited to ribulose-1,5-bisphosphate (RUBP) carboxylase
small subunit (ssu), beta-conglycinin promoter, phaseolin promoter,
ADH promoter, heat-shock promoters and tissue specific promoters.
Promoters may also contain certain enhancer sequence elements that
may improve the transcription efficiency. Typical enhancers
include but are not limited to Adh-intron 1 and Adh-intron 6.
Constitutive promoters may be used. Constitutive promoters direct
continuous gene expression in all cell types and at all times
(e.g., actin, ubiquitin, CaMV 35S). Tissue specific promoters are
responsible for gene expression in specific cell or tissue types,
such as the leaves or seeds (e.g., zein, oleosin, napin, ACP) and
these promoters may also be used. Promoters may also be active
during a certain stage of the plant's development as well as active
in plant tissues and organs. Examples of such promoters include
but are not limited to pollen-specific, embryo specific, corn silk
specific, cotton fiber specific, root specific, seed endosperm
specific promoters and the like.

Under certain circumstances it may be desirable to use an
inducible promoter. An inducible promoter is responsible for
expression of genes in response to a specific signal, such as:
physical stimulus (heat shock genes); light (RUBP carboxylase);
hormone (Em); metabolites; and stress. Other desirable
transcription and translation elements that function in plants may
be used. Numerous plant-specific gene transfer vectors are known
to the art.

In addition, it is known that to obtain high expression of
bacterial genes in plants it is preferred to reengineer the
bacterial genes so that they are more efficiently expressed in the
cytoplasm of plants. Maize is one such plant where it is preferred
to reengineer the bacterial gene(s) prior to transformation to
increase the expression level of the toxin in the plant. One
reason for the reengineering is the very low G+C content of the
native bacterial gene(s) (and consequent skewing towards high A+T
content). This results in the generation of sequences mimicking or
duplicating plant gene control sequences that are known to be
highly A+T rich. The presence of some A+T-rich sequences within
the DNA of the gene(s) introduced into plants (e.g., TATA box
regions normally found in gene promoters) may result in aberrant
transcription of the gene(s). On the other hand, the presence of
other regulatory sequences residing in the transcribed mRNA (e.g.,
polyadenylation signal sequences (AAUAAA), or sequences
complementary to small nuclear RNAs involved in pre-mRNA splicing)
may lead to RNA instability. Therefore, one goal in the design of
reengineered bacterial gene(s), more preferably referred to as
plant optimized gene(s), is to generate a DNA sequence having a
higher G+C content, and preferably one close to that of plant genes
coding for metabolic enzymes. Another goal in the design of the
plant optimized gene(s) is to generate a DNA sequence that not only
has a higher G+C content but whose sequence changes are also made
so as not to hinder translation.

An example of a plant whose genes have a high G+C content is maize.
Table 2 below summarizes the G+C content of maize protein coding
regions. As in maize, it is thought that G+C content in other
plants is also high.

Table 2

Compilation of G+C Contents of Protein Coding Regions
of Maize Genes

Protein Class a                 Range %G+C    Mean %G+C b
------------------------------  ------------  -------------
Metabolic Enzymes (40)          44.4-75.3     59.0 (8.0)
Storage Proteins
  Group I (23)                  46.0-51.9     48.1 (1.3)
  Group II (13)                 60.4-74.3     67.5 (3.2)
  Group I + II (36)             46.0-74.3     55.1 (9.6) c
Structural Proteins (18)        48.6-70.5     63.6 (6.7)
Regulatory Proteins (5)         57.2-68.9     62.0 (4.9)
Uncharacterized Proteins (9)    41.5-70.3     64.3 (7.2)
All Proteins (108)              44.4-75.3     60.8 (5.2)

a Number of genes in class given in parentheses.
b Standard deviations given in parentheses.
c Combined groups mean ignored in calculation of overall mean.

For the data in Table 2, coding regions of the genes were
extracted from GenBank (Release 71) entries, and base compositions
were calculated using the MacVector™ program (IBI, New Haven, CT).
Intron sequences were ignored in the calculations. Group I and II
storage protein gene sequences were distinguished by their marked
difference in base composition.
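The base composition calculation described for Table 2 can be sketched in a few lines of Python. This is only an illustration, not the MacVector procedure: the function names are hypothetical, the sequences are toy strings rather than maize genes, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def gc_percent(coding_seq: str) -> float:
    """Return %G+C of a DNA sequence, counting only A, C, G and T."""
    bases = [b for b in coding_seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T residues")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def summarize_class(coding_seqs):
    """Return (min, max, mean, sample SD) of %G+C for one protein class."""
    values = [gc_percent(s) for s in coding_seqs]
    # Assumption: sample standard deviation; a single gene gets SD 0.0.
    sd = stdev(values) if len(values) > 1 else 0.0
    return min(values), max(values), mean(values), sd


# Toy coding regions (introns already removed); NOT real maize genes.
toy_class = ["ATGGCCGCGTGA", "ATGAAATTTTAA"]
low, high, avg, sd = summarize_class(toy_class)
print(f"range {low:.1f}-{high:.1f}, mean {avg:.1f} ({sd:.1f})")
```

Applied to the intron-free coding regions of each protein class, a routine of this kind would give the range, mean and standard deviation columns of a table laid out like Table 2.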

Due to the plasticity afforded by the redundancy of the
genetic code (i.e., some amino acids are specified by more than one
codon), evolution of the genomes of different organisms or classes
of organisms has resulted in differential usage of redundant
codons. This "codon bias" is reflected in the mean base composition
of protein coding regions. For example, organisms with relatively
low G+C contents utilize codons having A or T in the third position
of redundant codons, whereas those having higher G+C contents
utilize codons having G or C in the third position. It is thought
that the presence of "minor" codons within a gene's mRNA may reduce
the absolute translation rate of that mRNA, especially when the
relative abundance of the charged tRNA corresponding to the minor
codon is low. An extension of this is that the diminution of
translation rate by individual minor codons would be at least
additive for multiple minor codons. Therefore, mRNAs having high
relative contents of minor codons would have correspondingly low
translation rates. This rate would be reflected by the synthesis
of low levels of the encoded protein.

In order to reengineer the bacterial gene(s), the codon bias
of the plant is determined. The codon bias is the statistical
codon distribution that the plant uses for coding its proteins.
After determining the bias, the percent frequency of the codons in
the gene(s) of interest is determined. The primary codons
preferred by the plant should be determined as well as the second
and third choice of preferred codons. The amino acid sequence of
the protein of interest is reverse translated so that the resulting
nucleic acid sequence codes for the same protein as the native
bacterial gene, but the resulting nucleic acid sequence corresponds
to the first preferred codons of the desired plant. The new
sequence is analyzed for restriction enzyme sites that might have
been created by the modification. The identified sites are further
modified by replacing the codons with second or third choice
preferred codons. Other sites in the sequence which could affect
the transcription or translation of the gene of interest are the
exon:intron 5' or 3' junctions, poly A addition signals, or RNA
polymerase termination signals. The sequence is further analyzed
and modified to reduce the frequency of TA or GC doublets. In
addition to the doublets, G or C sequence blocks that have more
than about four residues that are the same can affect transcription
of the sequence. Therefore, these blocks are also modified by
replacing the codons of first or second choice, etc. with the next
preferred codon of choice. It is preferred that the plant
optimized gene(s) contains about 63% of first choice codons,
between about 22% to about 37% second choice codons, and between
15% and 0% third choice codons, wherein the total percentage is
100%. Most preferably, the plant optimized gene(s) contain about
63% of first choice codons, at least about 22% second choice codons,
about 7.5% third choice codons, and about 7.5% fourth choice
codons, wherein the total percentage is 100%. The method
described above enables one skilled in the art to modify gene(s)
that are foreign to a particular plant so that the genes are
optimally expressed in plants. The method is further illustrated
in application PCT/US96/16582, WO 97/13402 published April 17,
1997.
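The sequence checks named in this procedure (counting TA and GC doublets and locating blocks of more than about four identical residues) lend themselves to a short script. The following Python sketch is illustrative only: the function names are hypothetical, and the threshold of five identical residues is an assumed reading of "more than about four".

```python
import re


def doublet_counts(dna: str) -> dict:
    """Count overlapping occurrences of the TA and GC doublets."""
    dna = dna.upper()
    return {
        d: sum(1 for i in range(len(dna) - 1) if dna[i:i + 2] == d)
        for d in ("TA", "GC")
    }


def same_residue_runs(dna: str, min_len: int = 5, bases: str = "ACGT"):
    """Return (start, run) for every block of >= min_len identical residues.

    `bases` restricts which residues are reported, e.g. bases="GC".
    """
    pattern = re.compile(r"([%s])\1{%d,}" % (bases, min_len - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(dna.upper())]


candidate = "ATGGGGGGTATAGCGCAAAA"  # toy sequence, not a toxin gene
print(doublet_counts(candidate))
print(same_residue_runs(candidate, bases="GC"))
```

Positions reported by such a scan would then be the places where a first choice codon is swapped for the next preferred codon, after which the scan is repeated.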

Thus, in order to design plant optimized gene(s) the amino
acid sequence of the toxins is reverse translated into a DNA
sequence, utilizing a nonredundant genetic code established from a
codon bias table compiled for the gene DNA sequence for the
particular plant being transformed. The resulting DNA sequence,
which is completely homogeneous in codon usage, is further modified
to establish a DNA sequence that, besides having a higher degree of
codon diversity, also contains strategically placed restriction
enzyme recognition sites, desirable base composition, and a lack of
sequences that might interfere with transcription of the gene, or
translation of the product mRNA.
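The reverse translation step can be illustrated with a short Python sketch. It ranks the codons for each amino acid by how often they occur in a set of reference coding sequences and then back-translates a protein using the first choice codon throughout. The function names and the one-gene reference set are hypothetical placeholders, not maize codon usage data; the codon table itself is the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_preferences(reference_coding_seqs):
    """Rank each amino acid's codons by frequency in the reference genes."""
    counts = Counter()
    for seq in reference_coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    prefs = {}
    for codon, aa in CODON_TABLE.items():
        prefs.setdefault(aa, []).append(codon)
    for codons in prefs.values():
        # Most frequent first; ties broken alphabetically for determinism.
        codons.sort(key=lambda c: (-counts[c], c))
    return prefs


def reverse_translate(protein, prefs, rank=0):
    """Back-translate using the codon of the given preference rank."""
    dna = []
    for aa in protein.upper():
        choices = prefs[aa]
        dna.append(choices[min(rank, len(choices) - 1)])
    return "".join(dna)


def translate(dna):
    """Translate an in-frame DNA sequence with the standard code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


toy_reference = ["ATGCTGCTGAAGCGCTGA"]  # placeholder, not plant data
prefs = codon_preferences(toy_reference)
optimized = reverse_translate("MLKR*", prefs)
assert translate(optimized) == "MLKR*"  # same protein, new codons
```

The output of `reverse_translate` is homogeneous in codon usage, as the text describes; introducing second and third choice codons at selected positions is the later, manual refinement step.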

It is theorized that bacterial genes may be more easily
expressed in plants if the bacterial genes are expressed in the
plastids. Thus, it may be possible to express bacterial genes in
plants, without optimizing the genes for plant expression, and
obtain high expression of the protein. See U.S. Patent Nos.
4,762,785; 5,451,513 and 5,545,817, which are incorporated herein
by reference.

One of the issues regarding commercially exploiting transgenic
plants is resistance management. This is of particular concern
with Bacillus thuringiensis toxins. There are numerous companies
commercially exploiting Bacillus thuringiensis and there has been
much concern about insects becoming resistant to Bt toxins. One
strategy for insect resistance management would be to combine the
toxins produced by Photorhabdus with toxins such as Bt, vegetative
insect proteins (Ciba Geigy) or other toxins. The combinations
could be formulated for a sprayable application or could be
molecular combinations. Plants could be transformed with
Photorhabdus genes that produce insect toxins and other insect
toxin genes such as Bt.
European Patent Application 0400246A1 describes transformation
of two Bt genes into a plant, which could be any two genes. Another
way to produce a transgenic plant that contains more than one insect
resistant gene would be to produce two plants, with each plant
containing an insect resistant gene. These plants would be
backcrossed using traditional plant breeding techniques to produce
a plant containing more than one insect resistant gene.

In addition to producing a transformed plant containing plant
optimized gene(s), there are other delivery systems where it may be
desirable to reengineer the bacterial gene(s). Along the same
lines, a genetically engineered, easily isolated protein toxin
fusing together both a molecule attractive to insects as a food
source and the insecticidal activity of the toxin may be engineered
and expressed in bacteria or in eukaryotic cells using standard,
well-known techniques. After purification in the laboratory such a
toxic agent with "built-in" bait could be packaged inside standard
insect trap housings.

Another delivery scheme is the incorporation of the genetic
material of toxins into a baculovirus vector. Baculoviruses infect
particular insect hosts, including those desirably targeted with
the Photorhabdus toxins. Infectious baculovirus harboring an
expression construct for the Photorhabdus toxins could be
introduced into areas of insect infestation to thereby intoxicate
or poison infected insects.

Transfer of the insecticidal properties requires nucleic acid
sequences encoding the amino acid sequences for the
Photorhabdus toxins integrated into a protein expression vector
appropriate to the host in which the vector will reside. One way
to obtain a nucleic acid sequence encoding a protein with
insecticidal properties is to isolate the native genetic material
which produces the toxins from Photorhabdus, using information
deduced from the toxin's amino acid sequence, large portions of
which are set forth below. As described below, methods of
purifying the proteins responsible for toxin activity are also
disclosed.

Using N-terminal amino acid sequence data, such as set forth
below, one can construct oligonucleotides complementary to all, or
a section of, the DNA bases that encode the first amino acids of
the toxin. These oligonucleotides can be radiolabeled and used as
molecular probes to isolate the genetic material from a
genetic library built from genetic material isolated from strains
of Photorhabdus. The genetic library can be cloned in plasmid,
cosmid, phage or phagemid vectors. The library could be
transformed into Escherichia coli and screened for toxin production
by the transformed cells using antibodies raised against the toxin
or direct assays for insect toxicity.

This approach requires the production of a battery of
oligonucleotides, since the degenerate genetic code allows an amino
acid to be encoded in the DNA by any of several three-nucleotide
combinations. For example, the amino acid arginine can be encoded
by nucleic acid triplets CGA, CGC, CGG, CGT, AGA, and AGG. Since
one cannot predict which triplet is used at those positions in the
toxin gene, one must prepare oligonucleotides with each potential
triplet represented. More than one DNA molecule corresponding to a
protein subunit may be necessary to construct a sufficient number
of oligonucleotide probes to recover all of the protein subunits
necessary to achieve oral toxicity.
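The size of the oligonucleotide battery follows directly from the degeneracy of the genetic code: the number of distinct coding sequences for a peptide is the product of the number of codons available for each residue. A minimal Python sketch of that arithmetic is given below. The function names and the example peptide are hypothetical and are not taken from the toxin sequences; the codon table is the standard genetic code.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(codons_for(aa)) for aa in peptide.upper())


def least_degenerate_window(peptide: str, size: int):
    """Return (start, degeneracy) of the window needing the fewest oligos."""
    if not 0 < size <= len(peptide):
        raise ValueError("window size must be between 1 and len(peptide)")
    starts = range(len(peptide) - size + 1)
    best = min(starts, key=lambda i: degeneracy(peptide[i:i + size]))
    return best, degeneracy(peptide[best:best + size])


print(codons_for("R"))       # the six arginine triplets named above
print(degeneracy("MRK"))     # 1 * 6 * 2 possible coding sequences
```

Choosing a stretch of the N-terminal sequence that is rich in methionine and tryptophan, each of which has a single codon, keeps the number of oligonucleotides that must be synthesized small.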

From the amino acid sequence of the purified protein, genetic
materials responsible for the production of toxins can readily be
isolated and cloned, in whole or in part, into an expression vector
using any of several techniques well-known to one skilled in the
art of molecular biology. A typical expression vector is a DNA
plasmid, though other transfer means including, but not limited to,
cosmids, phagemids and phage are also envisioned. In addition to
features required or desired for plasmid replication, such as an
origin of replication and antibiotic resistance or other form of a
selectable marker such as the bar gene of Streptomyces
hygroscopicus or viridochromogenes, protein expression vectors
normally additionally require an expression cassette which
incorporates the cis-acting sequences necessary for transcription
and translation of the gene of interest. The cis-acting sequences
required for expression in prokaryotes differ from those required
in eukaryotes and plants.

A eukaryotic expression cassette requires a transcriptional
promoter upstream (5') to the gene of interest, a transcriptional
termination region such as a poly-A addition site, and a ribosome
binding site upstream of the gene of interest's first codon. In
bacterial cells, a useful transcriptional promoter that could be
included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to efficiently
promote transcription of mRNA. Also upstream from the gene of
interest the vector may include a nucleotide sequence encoding a
signal sequence known to direct a covalently linked protein to a
particular compartment of the host cells such as the cell surface.

Insect viruses, or baculoviruses, are known to infect and
adversely affect certain insects. The effect of the viruses on
insects is slow, and viruses do not stop the feeding of insects.
Thus viruses are not viewed as being useful as insect pest control
agents. Combining the Photorhabdus toxin genes into a baculovirus
vector could provide an efficient way of transmitting the toxins
while increasing the lethality of the virus. In addition, since
different baculoviruses are specific to different insects, it may
be possible to use a particular toxin to selectively target
particularly damaging insect pests. A particularly useful vector
for the toxin genes is the nuclear polyhedrosis virus. Transfer
vectors using this virus have been described and are now the
vectors of choice for transferring foreign genes into insects. The
virus-toxin gene recombinant may be constructed in an orally
transmissible form. Baculoviruses normally infect insect victims
through the mid-gut intestinal mucosa. The toxin gene inserted
behind a strong viral coat protein promoter would be expressed and
should rapidly kill the infected insect.

In addition to an insect virus or baculovirus or transgenic 
plant delivery system for the protein toxins of the present 
invention, the proteins may be encapsulated using Bacillus 
thuringiensis encapsulation technology such as but not limited to 
U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595 which are all 
incorporated herein by reference. Another delivery system for the 
protein toxins of the present invention is formulation of the 
protein into a bait matrix, which could then be used in above and 
below ground insect bait stations. Examples of such technology 
include but are not limited to PCT Patent Application WO 93/23996, 
which is incorporated herein by reference. 

As is described above, it might become necessary to modify the
sequence encoding the protein when expressing it in a non-native
host, since the codon preferences of other hosts may differ from
that of Photorhabdus. In such a case, translation may be quite
inefficient in a new host unless compensating modifications to the
coding sequence are made. Additionally, modifications to the amino
acid sequence might be desirable to avoid inhibitory
cross-reactivity with proteins of the new host, or to refine the
insecticidal properties of the protein in the new host. A
genetically modified toxin gene might encode a toxin exhibiting,
for example, enhanced or reduced toxicity, altered insect
resistance development, altered stability, or modified target
species specificity.

In addition to the Photorhabdus genes encoding the toxins, the 
scope of the present invention is intended to include related 
nucleic acid sequences which encode amino acid biopolymers 
homologous to the toxin proteins and which retain the toxic effect 
of the Photorhabdus proteins in insect species after oral 
ingestion. 

For instance, the toxins used in the present invention seem to
first inhibit larval feeding before death ensues. By manipulating
the nucleic acid sequence of a Photorhabdus toxin gene or its
controlling sequences, genetic engineers placing the toxin gene
into plants could modulate its potency or its mode of action to,
for example, keep the eating-inhibitory activity while eliminating
the absolute toxicity to the larvae. This change could permit the
transformed plant to survive until harvest without having the
unnecessarily dramatic effect on the ecosystem of wiping out all
target insects. All such modifications of the gene encoding the
toxin, or of the protein encoded by the gene, are envisioned to
fall within the scope of the present invention.

Other envisioned modifications of the nucleic acid include the 
addition of targeting sequences to direct the toxin to particular 
parts of the insect larvae for improving its efficiency. 

Strains W-14, ATCC 55397, 43948, 43949, 43950, 43951, 43952
have been deposited in the American Type Culture Collection, 12301
Parklawn Drive, Rockville, MD 20852 USA. Amino acid and nucleotide
sequence data for the W-14 native toxin (ATCC 55397) is presented
below. Isolation of the genomic DNA for the toxins from the
bacterial hosts is also exemplified herein. The other strains
identified herein have been deposited with the United States
Department of Agriculture, 1815 North University Drive, Peoria, IL
61604.

Standard molecular biology techniques were followed and
taught in the specification herein. Additional information may be
found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989),
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press;
Current Protocols in Molecular Biology, ed. F. M. Ausubel et al.,
(1997), which are both incorporated herein by reference.

The following abbreviations are used throughout the Examples: Tris
= tris (hydroxymethyl) amino methane; SDS = sodium dodecyl sulfate;
EDTA = ethylenediaminetetraacetic acid; IPTG = isopropylthio-β-
galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside;
CTAB = cetyltrimethylammonium bromide; kbp = kilobase pairs; dATP,
dCTP, dGTP, dTTP, I = 2'-deoxynucleoside 5'-triphosphates of
adenine, cytosine, guanine, thymine, and inosine, respectively; ATP
= adenosine 5' triphosphate.

Example 1

Purification of Toxin from Photorhabdus luminescens and
Demonstration of Toxicity after Oral Delivery of Purified Toxin

The insecticidal protein toxin of the present invention was
purified from Photorhabdus luminescens strain W-14, ATCC Accession
Number 55397. Stock cultures of Photorhabdus luminescens were
maintained on petri dishes containing 2% Proteose Peptone No. 3
(i.e., PP3, Difco Laboratories, Detroit MI) in 1.5% agar, incubated
at 25°C and transferred weekly. Colonies of the primary form of
the bacteria were inoculated into 200 ml of PP3 broth supplemented
with 0.5% polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma
Chemical Company, St. Louis, MO) in a one liter flask. The broth
cultures were grown for 72 hours at 30°C on a rotary shaker. The
toxin proteins can be recovered from cultures grown in the presence
or absence of Tween; however, the absence of Tween can affect the
form of the bacteria grown and the profile of proteins produced by
the bacteria. In the absence of Tween, a variant shift occurs
insofar as the molecular weight of at least one identified toxin
subunit shifts from about 200 kDa to about 185 kDa.

The 72 hour cultures were centrifuged at 10,000 x g for 30
minutes to remove cells and debris. The supernatant fraction that
contained the insecticidal activity was decanted and brought to 50
mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4. The pH
was adjusted to 8.6 by adding potassium hydroxide. This
supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia
LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4.
The toxic activity was adsorbed to the DEAE resin. This mixture
was then poured into a 2.6 x 40 cm column and washed with 50 mM
K2HPO4 at room temperature at a flow rate of 30 ml/hr until the
effluent reached a steady baseline UV absorbance at 280 nm. The
column was then washed with 150 mM KCl until the effluent again
reached a steady 280 nm baseline. Finally the column was washed
with 300 mM KCl and fractions were collected.
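The "appropriate volume" of 1.0 M stock needed to bring the supernatant to 50 mM follows from a simple mass balance. The Python sketch below works it through under two stated assumptions that are not in the text: the supernatant initially contains a negligible amount of phosphate, and its volume is taken as the full 200 ml culture volume. The function name is hypothetical.

```python
def stock_volume_ml(sample_ml: float, stock_molar: float,
                    target_molar: float) -> float:
    """Volume of stock to add so the final mixture is at target_molar.

    Mass balance: stock_molar * v = target_molar * (sample_ml + v),
    assuming the sample itself contributes no solute.
    """
    if not 0 < target_molar < stock_molar:
        raise ValueError("target must be positive and below the stock")
    return target_molar * sample_ml / (stock_molar - target_molar)


# Assumed 200 ml of supernatant, 1.0 M stock, 50 mM target.
volume = stock_volume_ml(200.0, 1.0, 0.050)
print(f"add about {volume:.1f} ml of 1.0 M stock")
```

Because the added stock also increases the total volume, the result is slightly more than the 10 ml that a naive 1:20 dilution estimate would suggest.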

Fractions containing the toxin were pooled and filter
sterilized using a 0.2 micron pore membrane filter. The toxin was
then concentrated and equilibrated to 100 mM KPO4, pH 6.9, using an
ultrafiltration membrane with a molecular weight cutoff of 100 kDa
(Centriprep-100). A
3 ml sample of the toxin concentrate was applied to the top of a
2.6 x 95 cm Sephacryl S-400 HR gel filtration column (Pharmacia LKB
Biotechnology). The eluent buffer was 100 mM KPO4, pH 6.9, which
was run at a flow rate of 17 ml/hr, at 4°C. The effluent was
monitored at 280 nm.

Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a biological
assay using Manduca sexta larvae. Fractions were either applied
directly onto the insect diet (Gypsy moth wheat germ diet, ICN
Biochemicals Division - ICN Biomedicals, Inc.) or administered by
intrahemocelic injection of a 5 µl sample through the first proleg
of 4th or 5th instar larvae using a 30 gauge needle. The weight of
each larva within a treatment group was recorded at 24 hour
intervals. Toxicity was presumed if the insect ceased feeding and
died within several days of consuming treated insect diet or if
death occurred within 24 hours after injection of a fraction.

The toxic fractions were pooled and concentrated using the
Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x 60
cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium
phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This
analysis revealed the toxin protein to be contained within a single
sharp peak that eluted from the column with a retention time of
approximately 33.6 minutes. This retention time corresponded to an
estimated molecular weight of 1,000 kDa. Peak fractions were
collected for further purification while fractions not containing
this protein were discarded. The peak eluted from the HPLC
absorbed UV light at 218 and 280 nm but did not absorb at 405 nm.
Absorbance at 405 nm was shown to be an attribute of xenorhabdin
antibiotic compounds.

Electrophoresis of the pooled peak fractions in a non-denaturing
agarose gel (Metaphor Agarose, FMC BioProducts) showed
that two protein complexes were present in the peak. The peak
material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a
1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH 7.0
and 1.9% agarose resolving gel buffered with 200 mM Tris-borate at
pH 8.3 under standard buffer conditions (anode buffer 1 M Tris-HCl,
pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The gels
were run at 13 mA constant current at 15°C until the phenol red
tracking dye reached the end of the gel. Two protein bands were
visualized in the agarose gels using Coomassie brilliant blue
staining.

The slower migrating band was referred to as "protein band 1"
and the faster migrating band was referred to as "protein band 2."
The two protein bands were present in approximately equal amounts.
The Coomassie stained agarose gels were used as a guide to precisely
excise the two protein bands from unstained portions of the gels.
The excised pieces containing the protein bands were macerated and
a small amount of sterile water was added. As a control, a portion
of the gel that contained no protein was also excised and treated
in the same manner as the gel pieces containing the protein.
Protein was recovered from the gel pieces by electroelution into
100 mM Tris-borate pH 8.3, at 100 volts (constant voltage) for two
hours. Alternatively, protein was passively eluted from the gel
pieces by adding an equal volume of 50 mM Tris-HCl, pH 7.0, to the
gel pieces, then incubating at 30°C for 16 hours. This allowed the
protein to diffuse from the gel into the buffer, which was then
collected.

Results of insect toxicity tests using HPLC-purified toxin
(33.6 min. peak) and agarose gel purified toxin demonstrated
toxicity of the extracts. Injection of 1.5 µg of the HPLC purified
protein killed within 24 hours. Both protein bands 1 and 2,
recovered from agarose gels by passive elution or electroelution,
were lethal upon injection. The amount of protein estimated
for these samples was less than 50 ng/larva. A comparison of the
weight gain and the mortality between the groups of larvae injected
with protein bands 1 or 2 indicated that protein band 1 was more
toxic by injection delivery.

When HPLC-purified toxin was applied to larval diet at a
concentration of 7.5 µg/larva, it caused a halt in larval weight
gain (24 larvae tested). The larvae began to feed, but after
consuming only a very small portion of the toxin treated diet they
began to show pathological symptoms induced by the toxin and
ceased feeding. The insect frass became discolored and most
larvae showed signs of diarrhea. Significant insect mortality
resulted when several 5 µg toxin doses were applied to the diet
over a 7-10 day period.

Agarose-separated protein band 1 significantly inhibited
larval weight gain at a dose of 200 ng/larva. Larvae fed similar
concentrations of protein band 2 were not inhibited and gained
weight at the same rate as the control larvae. Twelve larvae were
fed eluted protein and 45 larvae were fed protein-containing
agarose pieces. These two sets of data indicate that protein band
1 was orally toxic to Manduca sexta. In this experiment it
appeared that protein band 2 was not toxic to Manduca sexta.

Further analysis of protein bands 1 and 2 by SDS-PAGE under
denaturing conditions showed that each band was composed of several
smaller protein subunits. Proteins were visualized by Coomassie
brilliant blue staining followed by silver staining to achieve
maximum sensitivity.

The protein subunits in the two bands were very similar.
Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8,
65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical
profile except that the 25.1, 60.8, and 65.6 kDa proteins were not
present. The 56.2, 60.8, 65.6, and 184 kDa proteins were present
in the complex of protein band 1 at approximately equal
concentrations and represented 80% or more of the total protein
content of that complex.

The native HPLC-purified toxin was further characterized as
follows. The toxin was heat labile in that after being heated to
60°C for 15 minutes it lost its ability to kill or to inhibit
weight gain when injected or fed to Manduca sexta larvae. Assays
were designed to detect lipase, type C phospholipase, nuclease or
red blood cell hemolysis activities and were performed with
purified toxin. None of these activities were present. Antibiotic
zone inhibition assays were also done and the purified toxin failed
to inhibit growth of Gram-negative or -positive bacteria, yeast or
filamentous fungi, indicating that the toxin is not a xenorhabdin
antibiotic.

25 The native HPLC-purif ied toxin was tested for ability to kill 

insects other than Manduca sexta. Table 3 lists insects killed by 
the HPLC-purif ied Photorhabdus luminescens toxin in this study. 

Table 3

Insects Killed by Photorhabdus luminescens Toxin

Common Name        Order         Genus and species      Route of Delivery

Tobacco hornworm   Lepidoptera   Manduca sexta          Oral and injected
Mealworm           Coleoptera    Tenebrio molitor       Oral
Pharaoh ant        Hymenoptera   Monomorium pharaonis   Oral
German cockroach   Dictyoptera   Blattella germanica    Oral and injected
Mosquito           Diptera       Aedes aegypti          Oral

Further Characterization of the High Molecular Weight Toxin Complex

In yet further analysis, the toxin protein complex from W-14
growth medium was subjected to further characterization. The
culture conditions and initial purification steps through the S-400
HR column were identical to those described above. After isolation
of the high molecular weight toxin complex from the S-400 HR column
fractions, the toxic fractions were equilibrated with 10 mM
Tris-HCl, pH 8.6, and concentrated in Centriplus 100 (Amicon)
concentrators. The protein toxin complex was then applied to a weak
anion exchange (WAX) column, Vydac 301VPH575 (Hesperia, CA), at a
flow rate of 0.5 ml/min. The proteins were eluted with a linear
potassium chloride gradient, 0-250 mM KCl, in 10 mM Tris-HCl pH 8.6
for 50 min. Eight protein peaks were detected by absorbance at 280
nm.

Bioassays using neonate southern corn rootworm (Diabrotica
undecimpunctata howardi, SCR) larvae and tobacco hornworm (Manduca
sexta, THW) were performed on all fractions eluted from the HPLC
column. THW were grown on Gypsy Moth wheat germ diet (ICN) at 25°C
with a 16 hr light / 8 hr dark cycle. SCR were grown on Southern
Corn Rootworm Larval Insecta-Diet (BioServ) at 25°C with a 16 hr
light / 8 hr dark cycle.

The highest mortality for SCR and THW larvae was observed for
peak 6, which eluted with ca. 112 mM to 132 mM KCl. SDS-PAGE
analysis of peak 6 showed predominant peptides of 170 kDa, 66 kDa,
63 kDa, 59.5 kDa and 31 kDa. Western blot analysis was performed on
the peak 6 protein fraction with a mixture of polyclonal antibodies
made against TcaAii-syn, TcaAiii-syn, TcaBii-syn, TcaC-syn, and
TcbAii-syn peptides (described in Example 21) and C5F2, a
monoclonal antibody against the TcbAiii peptide. Peak 6 contained
immunoreactive bands of 170 kDa, 90 kDa, 66 kDa, 59.5 kDa and
31 kDa. These are very close to the predicted sizes for TcaC
(166 kDa), TcaAii + TcaAiii (92 kDa), TcaAiii (66 kDa), TcaBii
(60 kDa) and TcaAii (25 kDa), respectively. Peak 6, which was
further analyzed by native agarose gel electrophoresis as described
herein, migrated as a single band with similar mobility to that of
band 1.
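
As an illustration of the gradient arithmetic only, the following sketch maps a KCl concentration onto a time point of the linear 0-250 mM, 50 min gradient described above. It assumes an ideal linear gradient with no dead volume or mixing delay, which the text does not state; the function names are hypothetical and are not part of the described method.

```python
# Sketch: ideal linear salt gradient, 0-250 mM KCl over 50 min (values from the text).
# Assumption: no system dead volume or gradient delay is modeled.

def kcl_at_time(t_min, start_mm=0.0, end_mm=250.0, duration_min=50.0):
    """Return the nominal KCl concentration (mM) at t_min minutes into the gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient")
    return start_mm + (end_mm - start_mm) * t_min / duration_min


def time_at_kcl(conc_mm, start_mm=0.0, end_mm=250.0, duration_min=50.0):
    """Return the time (min) at which the gradient nominally reaches conc_mm."""
    if not start_mm <= conc_mm <= end_mm:
        raise ValueError("concentration must lie within the gradient")
    return duration_min * (conc_mm - start_mm) / (end_mm - start_mm)


# Nominal elution window for peak 6 (ca. 112 mM to 132 mM KCl).
peak6_start = time_at_kcl(112.0)
peak6_end = time_at_kcl(132.0)
print(f"peak 6 nominal window: {peak6_start:.1f} to {peak6_end:.1f} min")
```

Under these assumptions peak 6 would fall roughly between 22 and 27 minutes into the gradient; actual retention times depend on the instrument and column.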

The protein concentration of the purified peak 6 toxin protein
was determined using the BCA reagents (Pierce). Dilutions of the
protein were made in 10 mM Tris, pH 8.6, and applied to the diet
bioassays. After 240 hours all neonate larvae on diet bioassays
that received - ng or greater of the peak 6 protein fraction were
dead. The group of larvae that received 90 ng of the same fraction
had 40% mortality. After 240 hrs the survivors that received 90 ng
and 20 ng of peak 6 protein fraction were ca. 10% and 70%,
respectively, of the control weight.

The Photorhabdus luminescens utility and toxicity were further
characterized. Photorhabdus luminescens (strain W-14) culture
broth was produced as follows. The production medium was 2% Bacto
Proteose Peptone® Number 3 (PP3, Difco Laboratories, Detroit,
Michigan) in Milli-Q® deionized water. Seed culture flasks
consisted of 175 ml medium placed in a 500 ml tribaffled flask with
a Delong neck, covered with a Kaput and autoclaved for 20 minutes,
T=250°F. Production flasks consisted of 500 ml in a 2.8 liter
tribaffled flask with a Delong neck, covered by a Shin-etsu
silicon foam closure. These were autoclaved for 45 minutes,
T=250°F. The seed culture was incubated at 28°C at 150 rpm in a
gyrotory shaking incubator with a 2 inch throw. After 16 hours of
growth, 1% of the seed culture was placed in the production flask
which was allowed to grow for 24 hours before harvest. Production
of the toxin appears to occur during log phase growth. The microbial
broth was transferred to a 1L centrifuge bottle and the cellular
biomass was pelleted (30 minutes at 2500 RPM at 4°C, [R.C.F. = about
1600] HG-4L Rotor RC3 Sorvall centrifuge, DuPont, Wilmington, DE).
The primary broth was chilled at 4°C for 8-16 hours and
recentrifuged at least 2 hours (conditions above) to further
clarify the broth by removal of a putative mucopolysaccharide which
precipitated upon standing. (An alternative processing method
combined both steps and involved the use of a 16 hour
clarification centrifugation, same conditions as above.) This
broth was then stored at 4°C prior to bioassay or filtration.
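
For readers reproducing the clarification spin on a different rotor, the stated speed and relative centrifugal force can be related with the standard conversion RCF = 1.118e-5 x r x RPM^2 (r in cm). The sketch below is illustrative only; the rotor radius it uses is back-calculated from the stated 2500 RPM and R.C.F. of about 1600 and is an assumption, not a specification given in the text.

```python
# Sketch: standard RPM <-> RCF conversion (radius in centimeters).
# Assumption: the effective rotor radius is inferred, not quoted from the source.

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in RPM


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf, rpm):
    """Effective rotor radius (cm) implied by a speed and RCF pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


implied_radius = radius_from_rcf(1600.0, 2500.0)
print(f"implied effective radius: {implied_radius:.1f} cm")
print(f"RCF at 2500 RPM, 22.9 cm: {rcf_from_rpm(2500.0, 22.9):.0f} x g")
```

The same two functions can be used to pick an RPM setting that gives about 1600 x g on a rotor of a different radius.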

Photorhabdus culture broth and protein toxin(s) purified from
this broth showed activity (mortality and/or growth inhibition,
reduced adult emergence) against a number of insects. More
specifically, the activity is seen against corn rootworm (larvae
and adult), Colorado potato beetle, and turf grubs, which are
members of the insect order Coleoptera. Other members of the
Coleoptera include wireworms, pollen beetles, flea beetles, seed
beetles and weevils. Activity has also been observed against aster
leafhopper, which is a member of the order Homoptera. Other
members of the Homoptera include planthoppers, pear psylla, apple
sucker, scale insects, whiteflies, and spittle bugs, as well as
numerous host specific aphid species. The broth and purified
fractions are also active against beet armyworm, cabbage looper,
black cutworm, tobacco budworm, European corn borer, corn earworm,
and codling moth, which are members of the order Lepidoptera.
Other typical members of this order are clothes moth, Indian
mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm,
Eastern tent caterpillar, sod webworm, and fall armyworm. Activity
is also seen against fruitfly and mosquito larvae, which are
members of the order Diptera. Other members of the order Diptera
are pea midge, carrot fly, cabbage root fly, turnip root fly, onion
fly, crane fly, house fly, and various mosquito species. Activity
is seen against carpenter ant and Argentine ant, which are members
of the order Hymenoptera, which also includes fire ants, odorous
house ants, and little black ants.

The broth and purified fractions are useful for reducing
populations of insects and were used in a method of inhibiting an
insect population. The method may comprise applying to a locus of
the insect an effective insect inactivating amount of the active
described. Results are reported in Table 4.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or
purified HPLC fractions were applied directly to the surface (about
1.5 cm²) of 0.25 ml of artificial diet in 30 µl aliquots following
dilution in control medium or 10 mM sodium phosphate buffer, pH
7.0, respectively. The diet plates were allowed to air-dry in a
sterile flow-hood and the wells were infested with single, neonate
Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR)
hatched from sterilized eggs, with second instar SCR grown on
artificial diet or with second instar Diabrotica virgifera
virgifera (Western corn rootworm, WCR) reared on corn seedlings
grown in Metromix®. Second instar larvae were weighed prior to
addition to the diet. The plates were sealed, placed in a
humidified growth chamber and maintained at 27°C for the
appropriate period (4 days for neonate and adult SCR, 2-5 days for
WCR larvae, 7-14 days for second instar SCR). Mortality and weight
determinations were scored as indicated. Generally, 16 insects per
treatment were used in all studies. Control mortalities were as
follows: neonate larvae, <5%; adult beetles, 5%.

Activity against Colorado potato beetle was tested as follows.
Photorhabdus culture broth or control medium was applied to the
surface (about 2.0 cm²) of 1.5 ml of standard artificial diet held
in the wells of a 24-well tissue culture plate. Each well received
50 µl of treatment and was allowed to air dry. Individual second
instar Colorado potato beetle (Leptinotarsa decemlineata, CPB)
larvae were then placed onto the diet and mortality was scored
after 4 days. Ten larvae per treatment were used in all studies.
Control mortality was 3.3%.

Activity against Japanese beetle grubs and beetles was tested
as follows. Turf grubs (Popillia japonica, 2-3rd instar) were
collected from infested lawns and maintained in the laboratory in
soil/peat mixture with carrot slices added as additional diet.
Turf beetles were pheromone-trapped locally and maintained in the
laboratory in plastic containers with maple leaves as food.
Following application of undiluted Photorhabdus culture broth or
control medium to corn rootworm artificial diet (30 µl/1.54 cm²,
beetles) or carrot slices (larvae), both stages were placed singly
in a diet well and observed for any mortality and feeding. In both
cases there was a clear reduction in the amount of feeding (and
feces production) observed.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 µl of aqueous solution (Photorhabdus culture broth,
control medium or H2O) and approximately 20 1-day old larvae (Aedes
aegypti). There were 6 wells per treatment. The results were read
at 2 hours after infestation and did not change over the three day
observation period. No control mortality was seen.

Activity against fruitflies was tested as follows. Purchased
Drosophila melanogaster medium was prepared using 50% dry medium
and a 50% liquid of either water, control medium or Photorhabdus
culture broth. This was accomplished by placing 8.0 ml of dry
medium in each of 3 rearing vials per treatment and adding 8.0 ml
of the appropriate liquid. Ten late instar Drosophila melanogaster
maggots were then added to each vial. The vials were held on a
laboratory bench, at room temperature, under fluorescent ceiling
lights. Pupal or adult counts were made after 3, 7 and 10 days of
exposure. Incorporation of Photorhabdus culture broth into the
diet media for fruitfly maggots caused a slight (17%) but
significant reduction in day-10 adult emergence as compared to
water and control medium (3% reduction).

Activity against aster leafhopper was tested as follows. The
ingestion assay for aster leafhopper (Macrosteles severini) is
designed to allow ingestion of the active without other external
contact. The reservoir for the active/"food" solution is made by
making 2 holes in the center of the bottom portion of a 35 x 10 mm
Petri dish. A 2 inch Parafilm M square is placed across the top of
the dish and secured with an "O" ring. A 1 oz. plastic cup is then
infested with approximately 7 leafhoppers and the reservoir is
placed on top of the cup, Parafilm down. The test solution is then
added to the reservoir through the holes. In tests using undiluted
Photorhabdus culture broth, the broth and control medium were
dialyzed against water to reduce control mortality. Mortality is
reported at day 2, where 26.5% control mortality was seen. In the
tests using purified fractions (200 mg protein/ml) a final
concentration of 5% sucrose was used in all treatments to improve
survivability of the aster leafhoppers. The assay was held in an
incubator at 28°C, 70% RH with a 16/8 photoperiod. The assay was
graded for mortality at 72 hours. Control mortality was 5.5%.

Activity against Argentine ants was tested as follows. A 1.5
ml aliquot of 100% Photorhabdus culture broth, control medium or
water was pipetted into 2.0 ml clear glass vials. The vials were
plugged with a piece of cotton dental wick that was moistened with
the appropriate treatment. Each vial was placed into a separate
60 x 16 mm Petri dish with 8 to 12 adult Argentine ants (Linepithema
humile). There were three replicates per treatment. Bioassay
plates were held on a laboratory bench, at room temperature under
fluorescent ceiling lights. Mortality readings were made after 5
days of exposure. Control mortality was 24%.

Activity against carpenter ant was tested as follows. Black 
carpenter ant workers (Camponotus pennsylvanicus) were collected 
from trees on DowElanco property in Indianapolis, IN. Tests with 
Photorhabdus culture broth were performed as follows. Each plastic 
bioassay container (7 1/8" x 3") held fifteen workers, a paper 
harborage and 10 ml of broth or control media in a plastic shot 
glass. A cotton wick delivered the treatment to the ants through a 
hole in the shot glass lid. All treatments contained 5% sucrose. 
Bioassays were held in the dark at room temperature and graded at 
19 days. Control mortality was 9%. Assays delivering purified 
fractions utilized artificial ant diet mixed with the treatment 
(purified fraction or control solution) at a rate of 0.2 ml 
treatment/2.0 g diet in a plastic test tube. The final protein 
concentration of the purified fraction was less than 10 µg/g diet.
Ten ants per treatment, a water source, harborage and the treated 
diet were placed in sealed plastic containers and maintained in the 
dark at 27°C in a humidified incubator. Mortality was scored at 
day 10. No control mortality was seen. 

Activity against various lepidopteran larvae was tested as
follows. Photorhabdus culture broth or purified fractions were
applied directly to the surface (about 1.5 cm²) of 0.25 ml of
standard artificial diet in 30 µl aliquots following dilution in
control medium or 10 mM sodium phosphate buffer, pH 7.0,
respectively. The diet plates were allowed to air-dry in a sterile
flow-hood and the wells were infested with a single neonate larva.
European corn borer (Ostrinia nubilalis) and corn earworm
(Helicoverpa zea) eggs were supplied from commercial sources and
hatched in-house, whereas beet armyworm (Spodoptera exigua),
cabbage looper (Trichoplusia ni), tobacco budworm (Heliothis
virescens), codling moth (Laspeyresia pomonella) and black cutworm
(Agrotis ipsilon) larvae were supplied internally. Following
infestation with larvae, the diet plates were sealed, placed in a
humidified growth chamber and maintained in the dark at 27°C for
the appropriate period. Mortality and weight determinations were
scored at days 5-7 for Photorhabdus culture broth and days 4-7 for
the purified fraction. Generally, 16 insects per treatment were
used in all studies. Control mortality ranged from 4-12.5% for
control medium and was less than 10% for phosphate buffer.

Table 4

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth and
Purified Toxin Fraction on Mortality and Growth Inhibition of
Different Insect Orders/Species

                                    Broth             Purified Fraction
Insect Order/Species           % Mort.   % G.I.      % Mort.   % G.I.

COLEOPTERA
Corn Rootworm
  Southern/neonate larva       100       na          100       na
  Southern/2nd instar          na        38.5        nt        nt
  Southern/adult               45        nt          nt        nt
  Western/2nd instar           na        35          nt        nt
Colorado Potato Beetle
  2nd instar                   93        nt          nt        nt
Turf Grub
  3rd instar                   na        a.f.        nt        nt
  adult                        na        a.f.        nt        nt

DIPTERA
Fruit Fly (adult emergence)    17        nt          nt        nt
Mosquito larvae                100       na          nt        nt

HOMOPTERA
Aster Leafhopper               96.5      na          100       na

HYMENOPTERA
Argentine Ant                  75        na          nt        na
Carpenter Ant                  71        na          100       na

LEPIDOPTERA
Beet Armyworm                  12.5      36          18.75     41.4
Black Cutworm                  nt        nt          0         71.2
Cabbage Looper                 nt        nt          21.9      66.8
Codling Moth                   nt        nt          6.25      45.9
Corn Earworm                   56.3      94.2        97.9      na
European Corn Borer            96.7      98.4        100       na
Tobacco Budworm                13.5      52.5        19.4      85.6

na = not applicable, nt = not tested, a.f. = anti-feedant

Example 3 

Insecticide Utility upon Soil Application 



Photorhabdus luminescens (strain W-14) culture broth was shown
to be active against corn rootworm when applied directly to soil or
a soil-mix (Metromix®). Activity against neonate SCR and WCR in
Metromix® was tested as follows (Table 5). The test was run using
corn seedlings (United Agriseeds brand CL614) that were germinated
in the light on moist filter paper for 6 days. After roots were
approximately 3-6 cm long, a single kernel/seedling was planted in
a 591 ml clear plastic cup with 50 gm of dry Metromix®. Twenty
neonate SCR or WCR were then placed directly on the roots of the
seedling and covered with Metromix®. Upon infestation, the
seedlings were then drenched with 50 ml total volume of a diluted
broth solution. After drenching, the cups were sealed and left at
room temperature in the light for 7 days. Afterwards, the
seedlings were washed to remove all Metromix® and the roots were
excised and weighed. Activity was rated as the percentage of corn
root remaining relative to the control plants and as leaf damage
induced by feeding. Leaf damage was scored visually and rated as
either -, +, ++, or +++, with - representing no damage and +++
representing severe damage.

Activity against neonate SCR in soil was tested as follows
(Table 6). The test was run using corn seedlings (United Agriseeds
brand CL614) that were germinated in the light on moist filter
paper for 6 days. After the roots were approximately 3-6 cm long,
a single kernel/seedling was planted in a 591 ml clear plastic cup
with 150 gm of soil from a field in Lebanon, IN planted the
previous year with corn. This soil had not been previously treated
with insecticides. Twenty neonate SCR were then placed directly on
the roots of the seedling and covered with soil. After
infestation, the seedlings were drenched with 50 ml total volume of
a diluted broth solution. After drenching, the unsealed cups were
incubated in a high relative humidity chamber (80%) at 78°F.
Afterwards, the seedlings were washed to remove all soil and the
roots were excised and weighed. Activity was rated as the
percentage of corn root remaining relative to the control plants
and as leaf damage induced by feeding. Leaf damage was scored
visually and rated as either -, +, ++, or +++, with - representing
no damage and +++ representing severe damage.

Table 5

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
Rootworm Larvae after Post-Infestation Drenching (Metromix®)

Treatment             Larvae   Leaf Damage   Root Weight (g)    %

Southern Corn Rootworm
Water                                        0.4916 ± 0.023     100
Medium (2.0% v/v)                            0.4416 ± 0.029     100
Broth (6.25% v/v)                            0.4641 ± 0.081     100
Water                 +        +++           0.1410 ± 0.006     28.7
Media (2.0% v/v)               +++           0.1345 ± 0.028     30.4
Broth (1.56% v/v)     +                      0.4830 ± 0.031     104

Western Corn Rootworm
Water                                        0.4446 ± 0.019     100
Broth (2.0% v/v)                             0.4069 ± 0.026     100
Water                                        0.2202 ± 0.015     49
Broth (2.0% v/v)      +                      0.3879 ± 0.013     95

Table 6

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
Southern Corn Rootworm Larvae after Post-Infestation Drenching
(Soil)

Treatment          Larvae   Leaf Damage   Root Weight (g)    %

Water              -        -             0.2148 ± 0.014     100
Broth (50% v/v)    -        -             0.2260 ± 0.016     103
Water              +        +++           0.0916 ± 0.009     43
Broth (50% v/v)    +        -             0.2428 ± 0.032     113
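
The "%" column of Table 6 can be illustrated with a short calculation. The sketch below assumes that the percentage is the mean root weight of a treatment divided by the mean root weight of the uninfested water control; that reading is an inference from the reported numbers for the infested rows, and the function name is hypothetical.

```python
# Sketch: percent of root remaining relative to a control mean.
# Assumption: the reference is the uninfested water control (0.2148 g in Table 6).

def percent_of_control(treated_root_g, control_root_g):
    """Root weight of a treatment expressed as a percentage of the control."""
    if control_root_g <= 0:
        raise ValueError("control root weight must be positive")
    return 100.0 * treated_root_g / control_root_g


UNINFESTED_WATER_G = 0.2148  # Table 6, water without larvae

infested_water = percent_of_control(0.0916, UNINFESTED_WATER_G)
infested_broth = percent_of_control(0.2428, UNINFESTED_WATER_G)
print(f"water + larvae: {infested_water:.0f}%")
print(f"broth + larvae: {infested_broth:.0f}%")
```

Rounded to whole numbers, these two values agree with the 43 and 113 reported for the infested treatments.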

Activity of Photorhabdus luminescens (strain W-14) culture
broth against second instar turf grubs in Metromix® was observed in
tests conducted as follows (Table 7). Approximately 50 gm of dry
Metromix® was added to a 591 ml clear plastic cup. The Metromix®
was then drenched with 50 ml total volume of a 50% (v/v) diluted
Photorhabdus broth solution. The dilution of crude broth was made
with water, with 50% broth being prepared by adding 25 ml of crude
broth to 25 ml of water for 50 ml total volume. A 1% (w/v)
solution of proteose peptone #3 (PP3), which is a 50% dilution of
the normal media concentration, was used as a broth control. After
drenching, five second instar turf grubs were placed on the top of
the moistened Metromix®. Healthy turf grub larvae burrowed rapidly
into the Metromix®. Those larvae that did not burrow within 1 h were
removed and replaced with fresh larvae. The cups were sealed and
placed in a 28°C incubator, in the dark. After seven days, larvae
were removed from the Metromix® and scored for mortality. Activity
was rated as the percentage of mortality relative to control.
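
The drench preparation above is simple volume-fraction arithmetic, sketched here for convenience. The helper names are hypothetical; the numbers (50 ml total, 50% v/v broth, 2% w/v PP3 production medium) are taken from the text.

```python
# Sketch: volumes for a v/v dilution and the concentration after dilution.

def dilution_volumes(total_ml, stock_fraction):
    """Return (stock_ml, diluent_ml) for a v/v dilution to total_ml."""
    if not 0.0 <= stock_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    stock_ml = total_ml * stock_fraction
    return stock_ml, total_ml - stock_ml


def diluted_concentration(stock_percent, stock_fraction):
    """Concentration (same units as the stock) after a v/v dilution."""
    return stock_percent * stock_fraction


broth_ml, water_ml = dilution_volumes(50.0, 0.50)
print(f"50% broth drench: {broth_ml:.0f} ml broth + {water_ml:.0f} ml water")
print(f"PP3 control: {diluted_concentration(2.0, 0.50):.1f}% (w/v)")
```

This reproduces the 25 ml + 25 ml preparation and the 1% (w/v) PP3 control described in the paragraph.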

Table 7

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
Turf Grub after Pre-Infestation Drenching (Metromix)

Treatment                    Mortality*    Mortality %

Water                        7/15          47
Control medium (1.0% w/v)    12/19         63
Broth (50% v/v)              17/20         85

*expressed as a ratio of dead/living larvae
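
As a check on the Mortality % column, the sketch below converts each ratio in Table 7 to a percentage. It assumes the denominator is the total number of larvae scored, since that reading reproduces the reported percentages; the dictionary keys simply repeat the treatment labels.

```python
# Sketch: mortality percentage from counts, assuming dead / total scored.

def mortality_percent(dead, total):
    """Percentage of scored larvae that were dead."""
    if total <= 0 or not 0 <= dead <= total:
        raise ValueError("counts must satisfy 0 <= dead <= total, total > 0")
    return 100.0 * dead / total


table7 = {
    "Water": (7, 15),
    "Control medium (1.0% w/v)": (12, 19),
    "Broth (50% v/v)": (17, 20),
}

for treatment, (dead, total) in table7.items():
    print(f"{treatment}: {mortality_percent(dead, total):.0f}%")
```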

Example 4 

Insecticide Utility upon Leaf Application

Activity of Photorhabdus broth against European corn borer was
seen when the broth was applied directly to the surface of maize
leaves (Table 8). In these assays Photorhabdus broth was diluted
100-fold with culture medium and applied manually to the surface of
excised maize leaves at a rate of about 6.0 µl/cm² of leaf surface.
The leaves were air dried and cut into equal sized strips
approximately 2 x 2 inches. The leaves were rolled, secured with
paper clips and placed in 1 oz plastic shot glasses with 0.25 inch
of 2% agar on the bottom surface to provide moisture. Twelve
neonate European corn borers were then placed onto the rolled leaf
and the cup was sealed. After incubation for 5 days at 27°C in the
dark, the samples were scored for feeding damage and recovered
larvae.
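
To give a sense of scale for the application rate, the sketch below estimates the volume of diluted broth carried by one leaf piece of approximately 2 x 2 inches at about 6.0 µl/cm². It assumes a flat rectangular piece treated on one surface, which is a simplification; the function name is hypothetical.

```python
# Sketch: applied volume on one leaf piece at a given rate per unit area.
# Assumption: flat rectangular piece, one treated surface.

CM_PER_INCH = 2.54


def applied_volume_ul(width_in, height_in, rate_ul_per_cm2=6.0):
    """Volume (µl) applied to a rectangular leaf piece given in inches."""
    area_cm2 = (width_in * CM_PER_INCH) * (height_in * CM_PER_INCH)
    return area_cm2 * rate_ul_per_cm2


volume = applied_volume_ul(2.0, 2.0)
print(f"about {volume:.0f} µl of diluted broth per 2 x 2 inch piece")
```

At a 100-fold dilution, roughly one hundredth of that volume would correspond to undiluted broth.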

Table 8

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
European Corn Borer Larvae Following Pre-Infestation Application to
Excised Maize Leaves

Treatment           Leaf Damage   Larvae Recovered   Weight (mg)

Water               Extensive     55/120             0.42 mg
Control Medium      Extensive     40/120             0.50 mg
Broth (1.0% v/v)    Trace         3/120              0.15 mg

Activity of the culture broth against neonate tobacco budworm
(Heliothis virescens) was demonstrated using a leaf dip
methodology. Fresh cotton leaves were excised from the plant and
leaf disks were cut with an 18.5 mm cork-borer. The disks were
individually immersed in control medium (PP3) or Photorhabdus
luminescens (strain W-14) culture broth which had been concentrated
approximately 10-fold using an Amicon (Beverly, MA) Proflux M12
tangential filtration system with a 10 kDa filter. Excess liquid
was removed and a straightened paper clip was placed through the
center of the disk. The paper clip was then wedged into a plastic,
1.0 oz shot glass containing approximately 2.0 ml of 1% agar. This
served to suspend the leaf disk above the agar. Following drying
of the leaf disk, a single neonate tobacco budworm larva was placed
on the disk and the cup was capped. The cups were then sealed in a
plastic bag and placed in a darkened, 27°C incubator for 5 days.
At this time the remaining larvae and leaf material were weighed to
establish a measure of leaf damage (Table 9).

Table 9

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay

                          Final Weights (mg)
Treatment             Leaf Disk      Larvae

Control leaves        55.7 ± 1.3     na*
Control Medium        34.0 ± 2.9     4.3 ± 0.91
Photorhabdus broth    54.3 ± 1.4     0.0**

* - not applicable, ** - no live larvae found



Example 5, Part A

Characterization of Toxin Peptide Components

In a subsequent analysis, the toxin protein subunits of the 
bands isolated as in Example 1 were resolved on a 7% SDS 
0.8 (acrylamide:BIS-acrylamide). This gel matrix facilitates better
resolution of the larger proteins. The gel system used to estimate
the Band 1 and Band 2 subunit molecular weights in Example 1 was an
18% gel with a ratio of 38:0.18 (acrylamide:BIS-acrylamide), which
allowed for a broader range of size separation, but less resolution
of higher molecular weight components.

In this analysis, 10, rather than 8, protein bands were 
resolved. Table 10 reports the calculated molecular weights of the 
10 resolved bands, and directly compares the molecular weights 
estimated under these conditions to those of the prior example. It 
is not surprising that additional bands were detected under the 
different separation conditions used in this example. Variations 
between the prior and new estimates of molecular weight are also to 
be expected given the differences in analytical conditions. In the 
analysis of this example, it is thought that the higher molecular 
weight estimates are more accurate than in Example 1, as a result 
of improved resolution. However, these are estimates based on SDS
PAGE analysis, which is typically not analytically precise, and
the apparent sizes of the peptides may have been further altered by
post- and co-translational modifications.

Amino acid sequences were determined for the N-terminal
portions of five of the 10 resolved peptides. Table 10
correlates the molecular weight of the proteins and the
identified sequences. In SEQ ID NO:2, certain analyses suggest
that the proline at residue 5 may be an asparagine (asn). In SEQ
ID NO:3, certain analyses suggest that the amino acid residues at
positions 13 and 14 are both arginine (arg). In SEQ ID NO:4,
certain analyses suggest that the amino acid residue at position 6
may be either alanine (ala) or serine (ser). In SEQ ID NO:5,
certain analyses suggest that the amino acid residue at position 3
may be aspartic acid (asp).

Table 10

ESTIMATE    NEW ESTIMATE*    SEQ. LISTING

208         200.2 kDa        SEQ ID NO:1
184         175.0 kDa        SEQ ID NO:2
65.6        68.1 kDa         SEQ ID NO:3
60.8        65.1 kDa         SEQ ID NO:4
56.2        58.3 kDa         SEQ ID NO:5
25.1        23.2 kDa         SEQ ID NO:15

*New estimates are based on SDS PAGE and are not based on
gene sequences. SDS PAGE is not analytically precise.



Example 5, Part B

Characterization of Toxin Peptide Components

A new N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn Gln
Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further N-terminal
sequencing of peptides isolated from native HPLC-purified toxin as
described in Example 5, Part A, above. This peptide comes from the
tcaA gene. The peptide, labeled TcaAii, starts at position 254 and
goes to position 491, where the TcaAiii peptide, SEQ ID NO:4,
starts. The estimated size of the peptide based on the gene sequence
is 25,240 Da.

Example 6

Characterization of Toxin Peptide Components

In yet another analysis, the toxin protein complex was
re-isolated from the Photorhabdus luminescens growth medium (after
culture without Tween) by performing a 10% - 80% ammonium sulfate
precipitation followed by an ion exchange chromatography step (Mono
Q) and two molecular sizing chromatography steps. These conditions
were like those used in Example 1. During the first molecular
sizing step, a second biologically active peak was found at about
100 ± 10 kDa. Based upon protein measurements, this fraction was
20 - 50 fold less active than the larger, or primary, active peak
of about 860 ± 100 kDa (native). During this isolation experiment
a smaller active peak of about 325 ± 50 kDa that retained a
considerable portion of the starting biological activity was also
resolved. It is thought that the 325 kDa peak is related to or
derived from the 860 kDa peak.

A 56 kDa protein was resolved in this analysis. The N-terminal
sequence of this protein is presented in SEQ ID NO:6. It
is noteworthy that this protein shares significant identity and
conservation with SEQ ID NO:5 at the N-terminus, suggesting that
the two may be encoded by separate members of a gene family and
that the proteins produced by each gene are sufficiently similar to
both be operable in the insecticidal toxin complex.

A second, prominent 185 kDa protein was consistently present 
in amounts comparable to that of protein 3 from Table 10, and may 
be the same protein or protein fragment. The N-terminal sequence 
of this 185 kDa protein is shown at SEQ ID NO: 7. 

Additional N-terminal amino acid sequence data were also
obtained from isolated proteins. None of the determined N-terminal
sequences appear identical to a protein identified in Table 10.
Other proteins were present in the isolated preparation. One such
protein has an estimated molecular weight of 108 kDa and an
N-terminal sequence as shown in SEQ ID NO:8. A second such protein
has an estimated molecular weight of 80 kDa and an N-terminal
sequence as shown in SEQ ID NO:9.

When the protein material in the approximately 325 kDa active
peak was analyzed by size, bands of approximately 51, 31, 28, and 
22 kDa were observed. As in all cases in which a molecular weight 
was determined by analysis of electrophoretic mobility, these 
molecular weights were subject to error effects introduced by 
buffer ionic strength differences, electrophoresis power 
differences, and the like. One of ordinary skill would understand 
that definitive molecular weight values cannot be determined using 
these standard methods and that each was subject to variation. It 
was hypothesized that proteins of these sizes are degradation 
products of the larger protein species (of approximately 200 kDa 
size) that were observed in the larger primary toxin complex. 

Finally, several preparations included a protein having the N- 
terminal sequence shown in SEQ ID NO: 10. This sequence was 
strongly homologous to known chaperonin proteins, accessory 
proteins known to function in the assembly of large protein 
complexes. Although the applicants could not ascribe such an 
assembly function to the protein identified in SEQ ID NO: 10, it was 
consistent with the existence of the described toxin protein 
complex that such a chaperonin protein could be involved in its 
assembly. Moreover, although such proteins have not directly been 
suggested to have toxic activity, this protein may be important to 
determining the overall structural nature of the protein toxin, and 
thus, may contribute to the toxic activity or durability of the 
complex in vivo after oral delivery. 
Subsequent analysis of the stability of the protein toxin 
complex to proteinase K was undertaken. It was determined that 
after 24 hour incubation of the complex in the presence of a 
10-fold molar excess of proteinase K, activity was virtually 
eliminated (mortality on oral application dropped to about 5%). 
These data confirm the proteinaceous nature of the toxin. 

The toxic activity was also retained by a dialysis membrane, 
again confirming the large size of the native toxin complex. 



Example 7 

Isolation, Characterization and Partial Amino Acid 
Sequencing of Photorhabdus Toxins 

Isolation and N-Terminal Amino Acid Sequencing 

In a set of experiments conducted in parallel to Examples 5 
and 6, ammonium sulfate precipitation of Photorhabdus proteins was 
performed by adjusting Photorhabdus broth, typically 2-3 liters, to 
a final concentration of either 10% or 20% by the slow addition of 
ammonium sulfate crystals. After stirring for 1 hour at 4°C, the 
material was centrifuged at 12,000 x g for 30 minutes. The 
supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C 
for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The 
pellet was resuspended in one-tenth the volume of 10 mM Na2·PO4, pH 
7.0 and dialyzed against the same phosphate buffer overnight at 
4°C. The dialyzed material was centrifuged at 12,000 x g for 1 
hour prior to ion exchange chromatography. 

A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was 
equilibrated with 10 mM Na2·PO4, pH 7.0. Centrifuged, dialyzed 
ammonium sulfate pellet was applied to the Q Sepharose column at a 
rate of 1.5 ml/min and washed extensively at 3.0 ml/min with 
equilibration buffer until the optical density (O.D. 280) reached 
less than 0.100. Next, either a 60 minute NaCl gradient ranging 
from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 0.1 
M, 0.4 M and finally 1.0 M NaCl for 60 minutes each was applied to 
the column. Fractions were pooled and concentrated using a 
Centriprep 100. Alternatively, proteins could be eluted by a 
single 0.4 M NaCl wash without prior elution with 0.1 M NaCl. 

Two milliliter aliquots of concentrated Q Sepharose samples 
were loaded at 0.5 ml/min onto a HR 16/50 Superose 12 (Pharmacia) 
gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0. The 
column was washed with the same buffer for 240 min at 0.5 ml/min 
and 2 min samples were collected. The void volume material was 
collected and concentrated using a Centriprep 100. Two milliliter 
aliquots of concentrated Superose 12 samples were loaded at 0.5 
ml/min onto a HR 16/50 Sepharose 4B-CL (Pharmacia) gel filtration 
column equilibrated with 10 mM Na2·PO4, pH 7.0. The column was 
washed with the same buffer for 240 min at 0.5 ml/min and 2 min 
samples were collected. 
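
The gel filtration runs above are specified by run time, flow rate and collection interval rather than by volume. As a minimal sketch (the function name is illustrative and not part of the original procedure), the implied elution volume and fraction count can be worked out as follows: 

```python
def fraction_plan(run_min, flow_ml_per_min, fraction_min):
    """Return (total eluted volume in ml, volume per fraction in ml,
    number of fractions) for a constant-flow column run."""
    total_ml = run_min * flow_ml_per_min
    fraction_ml = fraction_min * flow_ml_per_min
    n_fractions = int(run_min // fraction_min)
    return total_ml, fraction_ml, n_fractions


# Values stated in the text: 240 min at 0.5 ml/min, 2 min samples.
total_ml, fraction_ml, n_fractions = fraction_plan(240, 0.5, 2)
print(total_ml, fraction_ml, n_fractions)  # 120.0 1.0 120
```

Under these figures each run passes about 120 ml of buffer and yields about 120 fractions of 1 ml each. 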

The excluded protein peak was subjected to a second 
fractionation by application to a gel filtration column that used a 
Sepharose CL-4B resin, which separates proteins ranging from about 
30 kDa to 1000 kDa. This fraction was resolved into two peaks; a 
minor peak at the void volume (>1000 kDa) and a major peak which 
eluted at an apparent molecular weight of about 860 kDa. Over a 
one week period subsequent samples subjected to gel filtration 
showed the gradual appearance of a third peak (approximately 325 
kDa) that seemed to arise from the major peak, perhaps by limited 
proteolysis. Bioassays performed on the three peaks showed that 
the void peak had no activity, while the 860 kDa toxin complex 
fraction was highly active, and the 325 kDa peak was less active, 
although quite potent. SDS PAGE analysis of Sepharose CL-4B toxin 
complex peaks from different fermentation productions revealed two 
distinct peptide patterns, denoted "P" and "S". The two patterns 
had marked differences in the molecular weights and concentrations 
of peptide components in their fractions. The "S" pattern, 
produced most frequently, had 4 high molecular weight peptides 
(> 150 kDa) while the "P" pattern had 3 high molecular weight 
peptides. In addition, the "S" peptide fraction was found to have 
2-3 fold more activity against European Corn Borer. This shift may 
be related to variations in protein expression due to age of 
inoculum and/or other factors based on growth parameters of aged 
cultures. 

Milligram quantities of peak toxin complex fractions 
determined to be "P" or "S" peptide patterns were subjected to 
preparative SDS PAGE, and transblotted with TRIS-glycine 
(Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 
3-4 hours. Blots were sent for amino acid analysis and N-terminal 
amino acid sequencing at Harvard MicroChem and Cambridge ProChem, 
respectively. Three peptides in the "S" pattern had unique 
N-terminal amino acid sequences compared to the sequences identified 
in the previous example. A 201 kDa (TcdAii) peptide set forth as 
SEQ ID NO: 13 below shared 33% amino acid identity and 50% 
similarity (similarity and identity were calculated by hand) with 
SEQ ID NO:1 (TcbAii) (in Table 10 vertical lines denote amino acid 
identities and colons indicate conservative amino acid 
substitutions). A second peptide of 197 kDa, SEQ ID NO: 14 (TcdB), 
had 42% identity and 58% similarity with SEQ ID NO: 2 (TcaC) 
(similarity and identity were calculated by hand). Yet a third 
peptide of 205 kDa was denoted TcdAii. In addition, a limited 
N-terminal amino acid sequence, SEQ ID NO: 16 (TcbA), of a peptide of 
at least 235 kDa was identical with the amino acid sequence, SEQ ID 
NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, containing 
a deduced amino acid sequence corresponding to SEQ ID NO:1 
(TcbAii). This indicates that the larger 235+ kDa peptide was 
proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ ID 
NO:1) during fermentation, possibly resulting in activation of the 
molecule. In yet another sequence, the sequence originally 
reported as SEQ ID NO: 5 (TcaBii) in Example 5 above was 
found to contain an aspartic acid residue (Asp) at the third 
position rather than glycine (Gly) and two additional amino acids, 
Gly and Asp, at the eighth and ninth positions, respectively. In 
yet two other sequences, SEQ ID NO: 2 (TcaC) and SEQ ID NO: 3 
(TcaBi), additional amino acid sequence was obtained. 
Densitometric quantitation was performed using a sample that was 
identical to the "S" preparation sent for N-terminal analysis. 
This analysis showed that the 201 kDa and 197 kDa peptides 
represent 7.0% and 7.2%, respectively, of the total Coomassie 
brilliant blue stained protein in the "S" pattern and are present in 
amounts similar to the other abundant peptides. It was speculated 
that these peptides may represent protein homologs, analogous to 
the situation found with other bacterial toxins, such as various 
CryI Bt toxins. These proteins vary from 40-90% similarity at their 
N-terminal amino acid sequence, which encompasses the toxic 
fragment. 

Internal Amino Acid Sequencing 

To facilitate cloning of toxin peptide genes, internal amino 
acid sequences of selected peptides were obtained as follows. 
Milligram quantities of peak 2A fractions determined to be "P" or 
"S" peptide patterns were subjected to preparative SDS PAGE, and 
transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes 
(ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for 
amino acid analysis and N-terminal amino acid sequencing at Harvard 
MicroChem and Cambridge ProChem, respectively. Three peptides, 
referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi 
(containing SEQ ID NO: 3) were subjected to trypsin digestion by 
Harvard MicroChem followed by HPLC chromatography to separate 
individual peptides. N-terminal amino acid analysis was performed 
on selected tryptic peptide fragments. Two internal peptides were 
sequenced for the peptide TcdAii (205 kDa peptide) referred to as 
TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ ID NO: 18). Two 
internal peptides were sequenced for the peptide TcaBi (68 kDa 
peptide) referred to as TcaBi-PT158 (SEQ ID NO: 19) and TcaBi-PT108 
(SEQ ID NO: 20). Four internal peptides were sequenced for the 
peptide TcbAii (201 kDa peptide) referred to as TcbAii-PT103 (SEQ 
ID NO:21), TcbAii-PT56 (SEQ ID NO:22), TcbAii-PT81(a) (SEQ ID 
NO:23), and TcbAii-PT81(b) (SEQ ID NO: 24). 
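
The tryptic digestion described above relies on the usual specificity of trypsin, which cleaves on the carboxyl side of lysine (K) or arginine (R), generally not when the next residue is proline (P). The following is a minimal sketch of that rule only; the input sequence is hypothetical and is not one of the sequences disclosed herein: 

```python
import re


def trypsin_digest(sequence):
    """Split a one-letter protein sequence after K or R,
    except where the following residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]


# Hypothetical sequence, chosen only to exercise the rule.
print(trypsin_digest("MKTAYRPLGKSERA"))
# ['MK', 'TAYRPLGK', 'SER', 'A']
```

In practice each fragment would then be resolved by HPLC and selected fragments sequenced from their N-termini, as stated in the text. 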

Table 11 

N-Terminal Amino Acid Sequences 
(similarity and identity were calculated by hand) 

201 kDa (33% identity & 50% similarity to SEQ ID NO:1) 
LIGYNNQFSG*A SEQ ID NO:13 

: || | = | 

FIQGYSDLFGN-A SEQ ID NO:1 

197 kDa (42% identity & 58% similarity to SEQ ID NO:2) 
MQNSQTFSVGEL SEQ ID NO:14 
||:|   |=  | 
MQDSPEVSITTL SEQ ID NO:2 
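
The hand-calculated figures for the 197 kDa peptide can be checked with a short sketch. It assumes an ungapped, position-by-position comparison of the two 12-residue sequences and treats the Asn/Asp and Val/Ile pairs as the conservative substitutions; that choice of pairs is an assumption made for illustration, since the text does not list the substitutions counted. 

```python
# Assumed conservative pairs (illustrative; not stated in the text).
CONSERVATIVE = {frozenset("ND"), frozenset("VI")}


def identity_and_similarity(a, b, conservative=CONSERVATIVE):
    """Return (% identity, % similarity), rounded to whole percent,
    for two equal-length sequences compared without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    identical = sum(x == y for x, y in zip(a, b))
    similar = identical + sum(
        x != y and frozenset((x, y)) in conservative for x, y in zip(a, b)
    )
    return round(100 * identical / len(a)), round(100 * similar / len(a))


print(identity_and_similarity("MQNSQTFSVGEL", "MQDSPEVSITTL"))  # (42, 58)
```

Five identical positions out of twelve give 42% identity, and the two assumed conservative pairs raise the similarity to 58%, in agreement with the table. 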

Example 8 

Construction of a Cosmid Library of Photorhabdus luminescens W-14 
Genomic DNA and its Screening to Isolate Genes Encoding Peptides 
Comprising the Toxic Protein Preparation 

As a prerequisite for the production of Photorhabdus insect 
toxic proteins in heterologous hosts, and for other uses, it is 
necessary to isolate and characterize the genes that encode those 
peptides. This objective was pursued by two approaches in parallel. 
One approach, described later, was based on the use of monoclonal 
and polyclonal antibodies raised against the purified toxin which 
were then used to isolate clones from an expression library. The 
other approach, described in this example, is based on the use of 
the N-terminal and internal amino acid sequence data to design 
degenerate oligonucleotides for use in PCR amplification. Either 
method can be used to identify DNA clones that contain the 
peptide-encoding genes so as to permit the isolation of the 
respective genes, and the determination of their DNA base sequence. 
Genomic DNA Isolation 

Photorhabdus luminescens strain W-14 (ATCC accession number 
55397) was grown on 2% proteose peptone #3 agar (Difco 
Laboratories, Detroit, MI) and insecticidal toxin competence was 
maintained by repeated bioassay after passage, using the method 
described in Example 1 above. A 50 ml shake culture was produced 
in a 175 ml baffled flask in 2% proteose peptone #3 medium, grown 
at 28°C and 150 rpm for approximately 24 hours. 15 ml of this 
culture was pelleted and frozen in its medium at -20°C until it was 
thawed for DNA isolation. The thawed culture was centrifuged (700 
x g, 30 min) and the floating orange mucopolysaccharide material 
was removed. The remaining cell material was centrifuged (25,000 x 
g, 15 min) to pellet the bacterial cells, and the medium was 
removed and discarded. 

Genomic DNA was isolated by an adaptation of the CTAB method 
described in section 2.4.1 of Current Protocols in Molecular 
Biology (Ausubel et al. eds, John Wiley & Sons, 1994) [modified to 
include a salt shock and with all volumes increased 10-fold]. The 
pelleted bacterial cells were resuspended in TE buffer (10 mM 
Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12 ml 
of 5 M NaCl was added; this mixture was centrifuged 20 min at 
15,000 x g. The pellet was resuspended in 5.7 ml TE and 300 µl of 
10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL Products, 
Grand Island, NY; in sterile distilled water) were added to the 
suspension. This mixture was incubated at 37°C for 1 hr; then 
approximately 10 mg lysozyme (Worthington Biochemical Corp., 
Freehold, NJ) was added. After an additional 45 min, 1 ml of 5 M 
NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) 
were added. This preparation was incubated 10 min at 65°C, then 
gently agitated and further incubated and agitated for 
approximately 20 min to assist clearing of the cellular material. 
An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) 
was added, mixed gently and centrifuged. After two extractions 
with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 
50:49:1, v/v/v; equilibrated with 1 M Tris-HCl, pH 8.0; 
Intermountain Scientific Corporation, Kaysville, UT), the DNA was 
precipitated with 0.6 volume of isopropanol. The DNA precipitate 
was gently removed with a glass rod, washed twice with 70% ethanol, 
dried, and dissolved in 2 ml STE (10 mM Tris-HCl pH 8.0, 10 mM 
NaCl, 1 mM EDTA). This preparation contained 2.5 mg/ml DNA, as 
determined by optical density at 260 nm (i.e., OD260). 
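
The conversion from an OD260 reading to a DNA concentration can be sketched as follows, using the common approximation that one absorbance unit at 260 nm corresponds to about 50 µg/ml of double-stranded DNA. The absorbance reading and dilution factor shown are hypothetical, chosen only so that the result matches the 2.5 mg/ml reported above; the actual reading is not given in the text. 

```python
def dsdna_mg_per_ml(a260, dilution_factor, ug_per_ml_per_a260=50.0):
    """Estimate dsDNA concentration (mg/ml) from an A260 reading
    taken on a diluted sample."""
    return a260 * dilution_factor * ug_per_ml_per_a260 / 1000.0


# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
conc = dsdna_mg_per_ml(0.5, 100)
total_mg = conc * 2.0  # the DNA was dissolved in 2 ml STE
print(conc, total_mg)  # 2.5 5.0
```

At 2.5 mg/ml in 2 ml, the preparation would contain about 5 mg of genomic DNA. 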
The size range of the genomic DNA was evaluated for suitability 
for library construction. CHEF gel 
analysis was performed in 1.5% agarose (SeaKem® LE, FMC BioProducts, 
Rockland, ME) gels with 0.5 X TBE buffer (44.5 mM Tris-HCl pH 8.0, 
44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR II apparatus with a 
Pulsewave 760 Switcher (Bio-Rad Laboratories, Inc., Richmond, CA). 
The running parameters were: initial A time, 3 sec; final A time, 
12 sec; 200 volts; running temperature, 4-18°C; run time, 16.5 hr. 
Ethidium bromide staining and examination of the gel under 
ultraviolet light indicated the DNA ranged from 30-250 kbp in size. 

Construction of Library 

A partial Sau3A I digest was made of this Photorhabdus genomic 
DNA preparation. The method was based on section 3.1.3 of Ausubel 
(supra). Adaptations included running smaller scale reactions under 
various conditions until nearly optimal results were achieved. 
Several scaled-up large reactions with varied conditions were run, 
the results analyzed on CHEF gels, and only the best large scale 
preparation was carried forward. In the optimal case, 200 µg of 
Photorhabdus genomic DNA was incubated with 1.5 units of Sau3A I 
(New England Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 
ml total volume of 1X NEB 4 buffer (supplied as 10X by the 
manufacturer). The reaction was stopped by adding 2 ml of PCI and 
centrifuging at 8000 x g for 10 min. To the supernatant were added 
200 µl of 5 M NaCl plus 6 ml of ice-cold ethanol. This preparation 
was chilled for 30 min at -20°C, then centrifuged at 12,000 x g for 
15 min. The supernatant was removed and the precipitate was dried 
in a vacuum oven at 40°C, then resuspended in 400 µl STE. 
Spectrophotometric assay indicated about 40% recovery of the input 
DNA. The digested DNA was size fractionated on a sucrose gradient 
according to section 5.3.2 of CPMB (op. cit.). A 10% to 40% (w/v) 
linear sucrose gradient was prepared with a gradient maker in 
Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo Alto, CA) and 
the DNA sample was layered on top. After centrifugation (26,000 
rpm, 17 hr, Beckman SW41 rotor, 20°C), fractions (about 750 µl) 
were drawn from the top of the gradient and analyzed by CHEF gel 
electrophoresis (as described earlier). Fractions containing Sau3A 
I fragments in the size range 20-40 kbp were selected and DNA was 
precipitated by a modification (amounts of all solutions increased 
approximately 6.3-fold) of the method in section 5.3.3 of Ausubel 
(supra). After overnight precipitation, the DNA was collected by 
centrifugation (17,000 x g, 15 min), dried, redissolved in TE, 
pooled into a final volume of 80 µl, and reprecipitated with the 
addition of 8 µl 3 M sodium acetate and 220 µl ethanol. The pellet 
collected by centrifugation as above was resuspended in 12 µl TE. 
Concentration of the DNA was determined by Hoechst 33258 dye 
(Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer TKO100 
fluorimeter (Hoefer Scientific Instruments, San Francisco, CA). 
Approximately 2.5 µg of the size-fractionated DNA was recovered. 

Thirty µg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was 
digested to completion with 100 units of restriction enzyme BamH I 
(NEB) in the manufacturer's buffer (final volume of 200 µl, 37°C, 1 
hr). The reaction was extracted with 100 µl of PCI and DNA was 
precipitated from the aqueous phase by addition of 20 µl 3 M sodium 
acetate and 550 µl -20°C absolute ethanol. After 20 min at -70°C, 
the DNA was collected by centrifugation (17,000 x g, 15 min), dried 
under vacuum, and dissolved in 180 µl of 10 mM Tris-HCl, pH 8.0. 
To this were added 20 µl of 10X CIP buffer (100 mM Tris-HCl, pH 
8.3; 10 mM ZnCl2; 10 mM MgCl2), and 1 µl (0.25 units) of 1:4 diluted 
calf intestinal alkaline phosphatase (Boehringer Mannheim 
Corporation, Indianapolis, IN). After 30 min at 37°C, the 
following additions were made: 2 µl 0.5 M EDTA, pH 8.0; 10 µl 10% 
SDS; 0.5 µl of 20 mg/ml proteinase K (as above), followed by 
incubation at 55°C for 30 min. Following sequential extractions 
with 100 µl of PCI and 100 µl phenol (Intermountain Scientific 
Corporation, equilibrated with 1 M Tris-HCl, pH 8.0), the 
dephosphorylated DNA was precipitated by addition of 72 µl of 7.5 M 
ammonium acetate and 550 µl -20°C ethanol, incubation on ice for 30 
min, and centrifugation as above. The pelleted DNA was washed once 
with 500 µl -20°C 70% ethanol, dried under vacuum, and dissolved in 
20 µl of TE buffer. 
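
The two sodium acetate precipitations described above (an 80 µl sample and a 200 µl digest) follow the same proportions: one-tenth volume of 3 M sodium acetate, then 2.5 volumes of ethanol relative to the combined volume. A minimal sketch of that arithmetic, with illustrative function and parameter names, is: 

```python
def precipitation_volumes(sample_ul, salt_divisor=10, ethanol_volumes=2.5):
    """Return (µl of 3 M sodium acetate, µl of ethanol) for a sample,
    using 1/10 volume of salt and 2.5 volumes of ethanol."""
    salt_ul = sample_ul / salt_divisor
    ethanol_ul = (sample_ul + salt_ul) * ethanol_volumes
    return salt_ul, ethanol_ul


print(precipitation_volumes(80))   # (8.0, 220.0)
print(precipitation_volumes(200))  # (20.0, 550.0)
```

Both results agree with the volumes stated in the text. The NaCl and ammonium acetate precipitations in this example use different proportions and are not covered by this sketch. 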

Ligation of the size-fractionated Sau3A I fragments to the 
BamH I-digested and phosphatased pWE15 vector was accomplished 
using T4 ligase (NEB) by a modification (i.e., use of premixed 10X 
ligation buffer supplied by the manufacturer) of the protocol in 
section 3.33 of Ausubel. Ligation was carried out overnight in a 
total volume of 20 µl at 15°C, followed by storage at -20°C. 

Four µl of the cosmid DNA ligation reaction, containing about 
1 µg of DNA, was packaged into bacteriophage lambda using a 
commercial packaging extract (Gigapack® III Gold Packaging Extract, 
Stratagene), following the manufacturer's directions. The packaged 
preparation was stored at 4°C until use. The packaged cosmid 
preparation was used to infect Escherichia coli XL1 Blue MR cells 
("Titering the Cosmid Library"), as follows. XL1 Blue MR cells 
were grown in LB medium (g/L: Bacto-tryptone, 10; Bacto-yeast 
extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories, Detroit, 
MI]) containing 0.2% (w/v) maltose plus 10 mM MgSO4, at 37°C. After 
5 hr growth, cells were pelleted at 700 x g (15 min) and 
resuspended in 6 ml of 10 mM MgSO4. The culture density was 
adjusted with 10 mM MgSO4 to OD600 = 0.5. The packaged cosmid 
library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M 
NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and 25 
µl of the diluted preparation was mixed with 25 µl of the diluted 
XL1 Blue MR cells. The mixture was incubated at 25°C for 30 min 
(without shaking), then 200 µl of LB broth was added, and 
incubation was continued for approximately 1 hr with occasional 
gentle shaking. Aliquots (20-40 µl) of this culture were spread on 
LB agar plates containing 100 mg/l ampicillin (i.e., LB-Amp100) and 
incubated overnight at 37°C. To store the library without 
amplification, single colonies were picked and inoculated into 
individual wells of sterile 96-well microwell plates; each well 
containing 75 µl of Terrific Broth (TB media: 12 g/l 
Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM 
KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (i.e., TB-Amp100) and 
incubated (without shaking) overnight at 37°C. After replicating 
the 96-well plate into a copy plate, 75 µl/well of 
filter-sterilized TB:glycerol (1:1, v/v; with, or without, 100 mg/l 
ampicillin) was added to the plate, it was shaken briefly at 100 
rpm, 37°C, and then closed with Parafilm® (American National Can, 
Greenwich, CT) and placed in a -70°C freezer for storage. Copy 
plates were grown and processed identically to the master plates. 
A total of 40 such master plates (and their copies) were prepared. 
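
Whether a library of this size is likely to contain a given gene can be estimated with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of clones. The following is a sketch only: the genome size (5,700 kbp) and the average insert size (30 kbp, the midpoint of the selected 20-40 kbp range) are assumptions not stated in the text, and 40 plates of 96 wells is an upper bound on the clone count because some wells may not grow. 

```python
import math

ASSUMED_GENOME_KBP = 5700  # assumption, not given in the text
ASSUMED_INSERT_KBP = 30    # assumption: midpoint of the 20-40 kbp range


def coverage_probability(n_clones, insert_kbp, genome_kbp):
    """Probability that a given locus is present in at least one clone."""
    f = insert_kbp / genome_kbp
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(probability, insert_kbp, genome_kbp):
    """Number of clones needed to reach the requested probability."""
    f = insert_kbp / genome_kbp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


full_library = 40 * 96  # at most 3840 clones
print(coverage_probability(full_library, ASSUMED_INSERT_KBP, ASSUMED_GENOME_KBP))
print(coverage_probability(800, ASSUMED_INSERT_KBP, ASSUMED_GENOME_KBP))
print(clones_needed(0.99, ASSUMED_INSERT_KBP, ASSUMED_GENOME_KBP))
```

Under these assumptions, a screen of roughly 800 to 900 clones would already be expected to contain most single-copy genes, and the full library would do so with high probability. 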

Screening of the Library with Radiolabeled DNA Probes 

To prepare colony filters for probing with radioactively 
labeled probes, ten 96-well plates of the library were thawed at 
25°C (bench top at room temperature). A replica plating tool with 
96 prongs was used to inoculate a fresh 96-well copy plate 
containing 75 µl/well of TB-Amp100. The copy plate was grown 
overnight (stationary) at 37°C, then shaken about 30 min at 100 rpm 
at 37°C. A total of 800 colonies was represented in these copy 
plates, due to nongrowth of some isolates. The replica tool was 
used to inoculate duplicate impressions of the 96-well arrays onto 
Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron, 220 x 
250 mm) which had been placed on solid LB-Amp100 (100 ml/dish) in 
Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm; Curtin Mathison 
Scientific, Inc., Wood Dale, IL). The colonies were grown on the 
membranes at 37°C for about 3 hr. 



sequence insert, see below) was grown on a separate Magna NT 
membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium 
supplemented with 35 mg/l chloramphenicol (i.e., LB-Cam35), and 
processed alongside the library colony membranes. Bacterial 
colonies on the membranes were lysed, and the DNA was denatured and 
neutralized according to a protocol taken from the Genius™ System 
User's Guide version 2.0 (Boehringer Mannheim, Indianapolis, IN). 
Membranes were placed colony side up on filter paper soaked with 
0.5 N NaOH plus 1.5 M NaCl for 15 min to denature, and neutralized 
on filter paper soaked with 1 M Tris-HCl pH 8.0, 1.5 M NaCl for 15 
min. After UV-crosslinking using a Stratagene UV Stratalinker set 
on auto crosslink, the membranes were stored dry at 25°C until use. 
Membranes were trimmed into strips containing the duplicate 
impressions of a single 96-well plate, then washed extensively by 
the method of section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X 
SSC, 0.1% (w/v) SDS, followed by 1 hr at 65°C in the same solution, 
then rinsed in 2X SSC in preparation for the hybridization step 
(20X SSC = 3 M NaCl, 0.3 M sodium citrate, pH 7.0). 
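
Because the SSC wash solutions are given as multiples of the 20X stock defined above, the working concentrations follow by simple proportion. A minimal sketch (names are illustrative): 

```python
# Composition of the 20X stock as defined in the text (molar).
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(strength_x):
    """Return the molar composition of SSC at the given strength,
    for example 3 for 3X SSC."""
    return {name: conc * strength_x / 20 for name, conc in SSC_20X.items()}


for strength in (3, 2):
    print(strength, ssc_molarity(strength))
```

On this basis 3X SSC is about 0.45 M NaCl with 0.045 M sodium citrate, and 2X SSC is about 0.3 M NaCl with 0.03 M sodium citrate. 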

Amplification of a Specific Genomic Fragment of a TcaC Gene 

Based on the N-terminal amino acid sequence determined for the 
purified TcaC peptide fraction [disclosed herein as SEQ ID NO: 2], a 
pool of degenerate oligonucleotides (pool S4Psh) was synthesized by 
standard β-cyanoethyl chemistry on an Applied BioSystem ABI394 
DNA/RNA Synthesizer (Perkin Elmer, Foster City, CA). The 
oligonucleotides were deprotected 8 hours at 55°C, dissolved in 
water, quantitated by spectrophotometric measurement, and diluted 
for use. This pool corresponds to the determined N-terminal amino 
acid sequence of the TcaC peptide. The determined amino acid 
sequence and the corresponding degenerate DNA sequence are given 
below, where A, C, G, and T are the standard DNA bases, and I 
represents inosine: 

Amino Acid  Met Gln Asp Ser Pro Glu Val 

S4Psh  5' ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT 3' 

Another set of degenerate oligonucleotides was synthesized 
(pool P2.3.5R), representing the complement of the coding strand 
for the determined amino acid sequence of SEQ ID NO: 17: 

Amino Acid  Ala Phe Asn Ile Asp Asp Val 

Codons   5' GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT 3' 

P2.3.5R  3' CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA 5' 
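
The size of each degenerate pool is the product of the number of alternatives at each position. The sketch below parses the notation used in the tables above (parenthesized alternatives, N for any base, I for inosine), counts the members of each pool, and checks that P2.3.5R is the position-by-position complement of the listed codons. The parser and its function names are illustrative and are not part of the original disclosure. 

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def parse(oligo):
    """Turn e.g. 'CA(A/G)' into [['C'], ['A'], ['A', 'G']]; N is any base."""
    positions = []
    tokens = re.findall(r"\(([^)]*)\)|([A-Z])", oligo.replace(" ", ""))
    for group, single in tokens:
        if group:
            positions.append(group.split("/"))
        elif single == "N":
            positions.append(list("ACGT"))
        else:
            positions.append([single])
    return positions


def degeneracy(oligo):
    """Number of distinct sequences represented by the pool."""
    count = 1
    for choices in parse(oligo):
        count *= len(choices)
    return count


def complement_sets(oligo):
    """Complement of each position, as a set of allowed bases."""
    return [frozenset(COMPLEMENT[b] for b in choices) for choices in parse(oligo)]


S4PSH = "ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT"
CODONS = "GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT"
P235R = "CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"

print(len(parse(S4PSH)), degeneracy(S4PSH))  # 20 64
print(len(parse(P235R)), degeneracy(P235R))  # 20 192
print(complement_sets(CODONS) == [frozenset(c) for c in parse(P235R)])  # True
```

Both pools are 20 bases long, with 64 and 192 members respectively. 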

These oligonucleotides were used as primers in Polymerase 
Chain Reactions (PCR®, Roche Molecular Systems, Branchburg, NJ) to 
amplify a specific DNA fragment from genomic DNA prepared from 
Photorhabdus strain W-14 (see above). A typical reaction (50 µl) 
contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng of 
genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and dTTP, 
1X GeneAmp® PCR buffer, and 2.5 units of AmpliTaq® DNA polymerase 
(both from Roche Molecular Systems; 10X GeneAmp® buffer is 100 mM 
Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin). Amplifications 
were performed in a Perkin Elmer Cetus DNA Thermal Cycler (Perkin 
Elmer, Foster City, CA) using 35 cycles of 94°C (1.0 min), 55°C 
(2.0 min), 72°C (3.0 min), followed by an extension period of 7.0 
min at 72°C. Amplification products were analyzed by 
electrophoresis through 2% w/v NuSieve® 3:1 agarose (FMC 
BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA, pH 8.0). 
A specific product of estimated size 250 bp was observed amongst 
numerous other amplification products by ethidium bromide (0.5 
µg/ml) staining of the gel and examination under ultraviolet light. 
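
The reaction components above are given as absolute amounts in a 50 µl volume. As a minimal sketch (function names are illustrative), the corresponding working concentrations and the programmed cycling time can be derived as follows; ramp times between temperatures are not included because the text does not give them. 

```python
def micromolar(amount_pmol, volume_ul):
    """pmol per µl is numerically equal to µmol per liter (µM)."""
    return amount_pmol / volume_ul


def programmed_minutes(cycles, step_minutes, final_extension_min):
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(step_minutes) + final_extension_min


primer_uM = micromolar(125, 50)          # 125 pmol of each primer pool
dntp_uM = micromolar(10 * 1000, 50)      # 10 nmol of each dNTP
run_min = programmed_minutes(35, (1.0, 2.0, 3.0), 7.0)

print(primer_uM, dntp_uM, run_min)  # 2.5 200.0 217.0
```

Each primer pool is therefore at 2.5 µM and each dNTP at 200 µM, with about 217 minutes of programmed hold time. 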

The region of the gel containing an approximately 250 bp 
product was excised, and a small plug (0.5 mm dia.) was removed and 
used to supply template for PCR amplification (40 cycles). The 
reaction (50 µl) contained the same components as above, minus 
genomic template DNA. Following amplification, the ends of the 
fragments were made blunt and were phosphorylated by incubation at 
25°C for 20 min with 1 unit of T4 DNA polymerase (NEB), 1 nmol ATP, 
and 2.15 units of T4 kinase (Pharmacia Biotech Inc., Piscataway, 
NJ). 

DNA fragments were separated from residual primers by 
electrophoresis through 1% w/v GTG® agarose (FMC) in TEA. A gel 
slice containing fragments of apparent size 250 bp was excised, and 
the DNA was extracted using a Qiaex kit (Qiagen Inc., Chatsworth, 
CA). 

The extracted DNA fragments were ligated to plasmid vector pBC 
KS(+) (Stratagene) that had been digested to completion with 
restriction enzyme Sma I and extracted in a manner similar to that 
described for pWE15 DNA above. A typical ligation reaction (16.3 
µl) contained 100 ng of digested pBC KS(+) DNA, 70 ng of 250 bp 
fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of T4 DNA 
ligase (Collaborative Biomedical Products, Bedford, MA), in 1X 
ligation buffer (50 mM Tris-HCl, pH 7.4; 10 mM MgCl2; 10 mM 
dithiothreitol; 1 mM spermidine, 1 mM ATP, 100 µg/ml bovine serum 
albumin). Following overnight incubation at 14°C, the ligated 
products were transformed into frozen, competent Escherichia coli 
DH5α cells (Gibco BRL) according to the supplier's recommendations, 
and plated on LB-Cam35 plates containing IPTG (119 µg/ml) and X-gal 
(50 µg/ml). Independent white colonies were picked, and plasmid 
DNA was prepared by a modified alkaline-lysis/PEG precipitation 
method (PRISM™ Ready Reaction DyeDeoxy™ Terminator Cycle 
Sequencing Kit Protocols; ABI/Perkin Elmer). The nucleotide 
sequence of both strands of the insert DNA was determined, using T7 
primers (pBC KS(+) bases 601-623: TAAAACGACGGCCAGTGAGCGCG) and LacZ 
primers (pBC KS(+) bases 792-816: ATGACCATGATTACGCCAAGCGCGC) and 
protocols supplied with the PRISM™ sequencing kit (ABI/Perkin 
Elmer). Nonincorporated dye-terminator dideoxyribonucleotides were 
removed by passage through Centri-Sep 100 columns (Princeton 
Separations, Inc., Adelphia, NJ) according to the manufacturer's 
instructions. The DNA sequence was obtained by analysis of the 
samples on an ABI Model 373A DNA Sequencer (ABI/Perkin Elmer). The 
DNA sequences of two isolates, GZ4 and HB14, were found to be as 
illustrated in Fig. 1. 

This sequence illustrates the following features: i) bases 1-20 
represent one of the 64 possible sequences of the S4Psh 
degenerate oligonucleotides, ii) the sequence of amino acids 1-3 
and 6-12 corresponds exactly to that determined for the N-terminus 
of TcaC (disclosed as SEQ ID NO:2), iii) the fourth amino acid 
encoded is a cysteine residue rather than serine; this difference 
is encoded within the degeneracy for the serine codons (see above), 
iv) the fifth amino acid encoded is proline, corresponding to the 
TcaC N-terminal sequence given as SEQ ID NO:2, v) bases 257-276 
encode one of the 192 possible sequences designed into the 
degenerate pool, vi) the TGA termination codon introduced at bases 
268-270 is the result of complementarity to the degeneracy built 
into the oligonucleotide pool at the corresponding position, and 
does not indicate a shortened reading frame for the corresponding 
gene. 
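
Features iii) and vi) can both be checked against the degenerate pools by enumeration. The sketch below assumes that the P2.3.5R-derived region occupies bases 257-276 on the coding strand with the GCN codon starting at base 257, as implied by the 20-base primer length; that placement is an inference rather than a statement in the text. 

```python
from itertools import product

# Serine position of S4Psh: (T/A)(C/G)(T/A)
SER_POSITION = ["TA", "CG", "TA"]
ser_pool_codons = {"".join(c) for c in product(*SER_POSITION)}

# Coding-strand equivalent of P2.3.5R, one entry per base:
# GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT
CODING = ["G", "C", "ACGT", "T", "T", "TC", "A", "A", "TC",
          "A", "T", "ATC", "G", "A", "TC", "G", "A", "TC", "G", "T"]
PRIMER_START = 257  # assumed first base of the primer-derived region


def possible_triplets(first_base, last_base):
    """All sequences the pool allows between two base numbers (inclusive)."""
    window = CODING[first_base - PRIMER_START:last_base - PRIMER_START + 1]
    return {"".join(t) for t in product(*window)}


print("TGT" in ser_pool_codons)            # True: a Cys codon is in the pool
print((268 - 1) % 3 == 0)                  # True: base 268 starts a codon
print(sorted(possible_triplets(268, 270)))  # ['AGA', 'CGA', 'TGA']
```

The serine position of the pool includes the cysteine codon TGT, and the pool permits TGA at bases 268-270, which lie in the reading frame set by the ATG at bases 1-3. Both observations are consistent with the explanation that these features arise from primer degeneracy rather than from the gene itself. 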

Labeling of a TcaC Peptide Gene-specific Probe 

DNA fragments corresponding to the above 276 bases were 
amplified (35 cycles) by PCR® in a 100 µl reaction volume, using 100 
pmol each of P2Psh and P2.3.5R primers, 10 ng of plasmids G24 or 
HB14 as templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5 
units of AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® 
buffer, under the same temperature regimes as described above. The 
amplification products were extracted from a 1% GTG® agarose gel by 
Qiaex kit and quantitated by fluorometry. 

The extracted amplification products from plasmid HB14 
template (approximately 400 ng) were split into five aliquots and 
labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer 
Mannheim) according to the manufacturer's instructions. 
Nonincorporated radioisotope was removed by passage through NucTrap 
Probe Purification Columns (Stratagene), according to the 
supplier's instructions. The specific activity of the labeled DNA 
product was determined by scintillation counting to be 3.11 x 10^8 
dpm/µg. This labeled DNA was used to probe membranes prepared from 
800 members of the genomic library. 

Screening with a TcaC-peptide Gene-Specific Probe 

The radiolabeled HB14 probe was boiled approximately 10 min, 
then added to "minimal hyb" solution. [Note: The "minimal hyb" 
method is taken from a CERES protocol; "Restriction Fragment Length 
Polymorphism Laboratory Manual version 4.0", sections 4-40 and 
4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its 
successors operating as Linkage Genetics]. "Minimal hyb" solution 
contains 10% w/v PEG (polyethylene glycol, M.W. approx. 8000), 7% 
w/v SDS; 0.6X SSC, 10 mM sodium phosphate buffer (from a 1 M stock 
containing 95 g/l NaH2PO4·1H2O and 84.5 g/l Na2HPO4·7H2O), 5 mM EDTA, 
and 100 µg/ml denatured salmon sperm DNA. Membranes were blotted 
dry briefly then, without prehybridization, 5 strips of membrane 
were placed in each of 2 plastic boxes containing 75 ml of "minimal 
hyb" and 2.6 ng/ml of radiolabeled HB14 probe. These were 
incubated overnight with slow shaking (50 rpm) at 60°C. The 
filters were washed three times for approximately 10 min each at 
25°C in "minimal hyb wash solution" (0.25X SSC, 0.2% SDS), followed 
by two 30-min washes with slow shaking at 60°C in the same 
solution. The filters were placed on paper covered with Saran Wrap® 
(Dow Brands, Indianapolis, IN) in a light-tight autoradiographic 
cassette and exposed to X-Omat X-ray film (Kodak, Rochester, NY) 
with two DuPont Cronex Lightning-Plus CI enhancers (Sigma Chemical 
Co., St. Louis, MO), for 4 hr at -70°C. Upon development (standard 
photographic procedures), significant signals were evident in both 
replicates amongst a high background of weaker, more irregular 
signals. The filters were again washed for about 4 hr at 68°C in 
"minimal hyb wash solution" and then placed again in the cassettes 
and film was exposed overnight at -70°C. Twelve possible positives 
were identified due to strong signals on both of the duplicate 
96-well colony impressions. No signal was seen with negative control 
membranes (colonies of XL1 Blue MR cells containing pWE15), and a 
very strong signal was seen with positive control membranes (DH5α 
cells containing the G24 isolate of the PCR product) that had been 
processed concurrently with the experimental samples. 
The twelve putative hybridization-positive colonies were
retrieved from the frozen 96-well library plates and grown
overnight at 37°C on solid LB-Amp100 medium. They were then patched
(3/plate, plus three negative controls: XL1 Blue MR cells
containing the pWE15 vector) onto solid LB-Amp100. Two sets of
membranes (Magna NT nylon, 0.45 micron) were prepared for
hybridization. The first set was prepared by placing a filter
directly onto the colonies on a patch plate, then removing it with
adherent bacterial cells, and processing as below. Filters of the
second set were placed on plates containing LB-Amp100 medium, then
inoculated by transferring cells from the patch plates onto the
filters. After overnight growth at 37°C, the filters were removed
from the plates and processed.

Bacterial cells on the filters were lysed and DNA denatured by
placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N NaOH
in a plastic plate for 3 min. The filters were blotted dry on a
paper towel, then the process was repeated with fresh 0.5 N NaOH.
After blotting dry, the filters were neutralized by placing each on
a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min, blotted dry, and
reneutralized with fresh buffer. This was followed by two similar
soakings (5 min each) on pools of 0.5 M Tris-HCl pH 7.5 plus 1.5 M
NaCl. After blotting dry, the DNA was UV crosslinked to the filter
(as above), and the filters were washed (25°C, 100 rpm) in about
100 ml of 3X SSC plus 0.1% (w/v) SDS (4 times, 30 min each with
fresh solution for each wash). They were then placed in a minimal
volume of prehybridization solution [6X SSC plus 1% w/v each of
Ficoll 400 (Pharmacia), polyvinylpyrrolidone (av. M.W. 360,000;
Sigma) and bovine serum albumin Fraction V (Sigma)] for 2 hr at
65°C, 50 rpm. The prehybridization solution was removed, and
replaced with the HB14 32P-labeled probe that had been saved from
the previous hybridization of the library membranes and which had
been denatured at 95°C for 5 min. Hybridization was performed at
60°C for 16 hr with shaking at 50 rpm.

Following removal of the labeled probe solution, the membranes
were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC (about 150
ml each wash). They were then washed for 3 hr at 68°C (50 rpm) in
0.25X SSC plus 0.2% SDS (minimal hyb wash solution), and exposed to
X-ray film as described above for 1.5 hr at 25°C (no enhancer
screens). This exposure revealed very strong hybridization signals
to cosmid isolates 22G12, 25A10, 26A5, and 26B10, and a very weak
signal with cosmid isolate 8B10. No signal was seen with the
negative control (pWE15) colonies, and a very strong signal was
seen with positive control membranes (DH5α cells containing the GZ4
isolate of the PCR product) that had been processed concurrently
with the experimental samples.

Amplification of a Specific Genomic Fragment of a TcaB Gene

Based on the N-terminal amino acid sequence determined for the
purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3) a
pool of degenerate oligonucleotides (pool P8F) was synthesized as
described for peptide TcaC. The determined amino acid sequence and
the corresponding degenerate DNA sequence are given below, where A,
C, G, and T are the standard DNA bases, and I represents inosine:

Amino
Acid    Leu     Phe Thr Gln     Thr Leu     Lys Glu Ala Arg
P8F  5' (C/T)TI TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G  3'

Another set of degenerate oligonucleotides was synthesized
(pool P8.108.3R), representing the complement of the coding strand
for the determined amino acid sequence of the TcaBi-PT108 internal
peptide (disclosed herein as SEQ ID NO:20):

Amino
Acid          Met Tyr     Tyr     Ile       Gln     Ala         Gln     Gln
Codons        ATG TA(T/C) TA(T/C) AT(T/C/A) CA(A/G) GC(A/C/G/T) CA(A/G) CA(A/G)
P8.108.3R  3' TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI         GT(T/C) GT 5'
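
The size of a degenerate primer pool follows directly from the
notation used above: each parenthesized group such as (A/G)
multiplies the number of distinct oligonucleotides by its number of
alternatives, while inosine (I) occupies a single position and adds
no degeneracy. The short Python sketch below illustrates that count.
It is offered for illustration only; the function name
pool_degeneracy is hypothetical and is not part of the disclosed
method.

```python
import re


def pool_degeneracy(primer: str) -> int:
    """Count the distinct oligonucleotides in a degenerate primer pool.

    Assumes alternatives are written as parenthesized, slash-separated
    groups, e.g. "CA(A/G)". Inosine ("I") is counted as a single base.
    """
    count = 1
    for group in re.findall(r"\(([^)]*)\)", primer):
        # Each group contributes one factor: its number of alternatives.
        count *= len(group.split("/"))
    return count


# Pool P8.108.3R as listed above (written 3' to 5').
P8_108_3R = "TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI GT(T/C) GT"

if __name__ == "__main__":
    print(pool_degeneracy(P8_108_3R))
```

For pool P8.108.3R the factors are 2 x 2 x 3 x 2 x 2, i.e. 48
distinct oligonucleotides.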

These oligonucleotides were used as primers for PCR using
HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA) to
amplify a specific DNA fragment from genomic DNA prepared from
Photorhabdus strain W-14 (see above). A typical reaction (50 ul)
contained (bottom layer) 25 pmol of each primer pool P8F and
P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X
GeneAmp PCR buffer, and (top layer) 230 ng of genomic template DNA,
8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of
AmpliTaq DNA polymerase, in 1X GeneAmp PCR buffer. Amplifications
were performed by 35 cycles as described for the TcaC peptide.
Amplification products were analyzed by electrophoresis through
0.7% w/v SeaKem LE agarose (FMC) in TEA buffer. A specific product
of estimated size 1600 bp was observed.
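
As a consistency check on the two-layer reaction described above,
the final concentrations can be worked out from the stated amounts.
The Python sketch below assumes that the reaction volume reads 50
microliters and that the two layers mix completely; the helper name
micromolar is illustrative only.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in micromolar: pmol per microliter equals umol per liter."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 50.0  # assumed reading of the stated reaction volume

# Each dNTP: 2 nmol in the bottom layer plus 8 nmol in the top layer.
dntp_pmol = (2 + 8) * 1000.0
# Each primer pool: 25 pmol.
primer_pmol = 25.0

dntp_um = micromolar(dntp_pmol, REACTION_VOLUME_UL)
primer_um = micromolar(primer_pmol, REACTION_VOLUME_UL)

if __name__ == "__main__":
    print(f"each dNTP: {dntp_um:g} uM; each primer pool: {primer_um:g} uM")
```

Under these assumptions each dNTP is present at 200 uM and each
primer pool at 0.5 uM in the mixed reaction.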

Four such reactions were pooled, and the amplified DNA was
extracted from a 1.0% SeaKem LE gel by Qiaex kit as described for
the TcaC peptide. The extracted DNA was used directly as the
template for sequence determination (PRISM Sequencing Kit) using
the P8F and P8.108.3R primer pools. Each reaction contained about
100 ng template DNA and 25 pmol of one primer pool, and was
processed according to standard protocols as described for the TcaC
peptide. An analysis of the sequence derived from extension of the
P8F primers revealed the short DNA sequence (and encoded amino acid
sequence):

GAT GCA TTG NTT   GCT
Asp Ala Leu (Val) Ala

which corresponds to a portion of the N-terminal peptide sequence
disclosed as SEQ ID NO:3 (TcaBi).



Labeling of a TcaBi-peptide Gene-specific Probe

Approximately 50 ng of gel-purified TcaBi DNA fragment was
labeled with 32P-dCTP as described above, and nonincorporated
radioisotopes were removed by passage through a NICK Column
(Pharmacia). The specific activity of the labeled DNA was
determined to be 6 x 10^9 dpm/ug. This labeled DNA was used to probe
colony membranes prepared from members of the genomic library that
had hybridized to the TcaC-peptide specific probe.

The membranes containing the 12 colonies identified in the
TcaC-probe library screen (see above) were stripped of radioactive
TcaC-specific label by boiling twice for approximately 30 min each
time in 1 liter of 0.1X SSC plus 0.1% SDS. Removal of radiolabel
was checked with a 6 hr film exposure. The stripped membranes were
then incubated with the TcaBi peptide-specific probe prepared
above. The labeled DNA was denatured by boiling for 10 min, and
then added to the filters that had been incubated for 1 hr in 100
ml of "minimal hyb" solution at 60°C. After overnight
hybridization at this temperature, the probe solution was removed,
and the filters were washed as follows (all in 0.3X SSC plus 0.1%
SDS): once for 5 min at 25°C, once for 1 hr at 60°C in fresh
solution, and once for 1 hr at 63°C in fresh solution. After 1.5
hr exposure to X-ray film by standard procedures, 4 strongly-
hybridizing colonies were observed. These were, as with the
TcaC-specific probe, isolates 22G12, 25A10, 26A5, and 26B10.
(about 100 ml) of "minimal hyb" solution, and then used to screen
the membranes containing the 800 members of the genomic library.
After hybridization, washing, and exposure to X-ray film as
described above, only the four cosmid clones 22G12, 25A10, 26A5,
and 26B10 were found to hybridize strongly to this probe.

Isolation of Subclones Containing Genes Encoding TcaC and TcaBi
Peptides, and Determination of DNA Base Sequence Thereof

Three hybridization-positive cosmids in strain XL1 Blue MR
were grown with shaking overnight (200 rpm) at 30°C in 100 ml
TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA
was prepared using a commercially available kit (BIGprep™, 5 Prime
3 Prime, Inc., Boulder, CO), following the manufacturer's
protocols. Only one cosmid, 26A5, was successfully isolated by
this procedure. When digested with restriction enzyme EcoR I (NEB)
and analyzed by gel electrophoresis, fragments of approximate sizes
14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp were detected. A
second attempt to isolate cosmid DNA from the same three strains (8
ml cultures; TB-Amp100, 30°C) utilized a boiling miniprep method
(Evans G. and G. Wahl, 1987, "Cosmid vectors for genomic walking
and rapid restriction mapping." in Guide to Molecular Cloning
Techniques. Meth. Enzymology, Vol. 152, S. Berger and A. Kimmel,
eds., pgs. 604-610). Only one cosmid, 25A10, was successfully
isolated by this method. When digested with restriction enzyme
EcoR I (NEB) and analyzed by gel electrophoresis, this cosmid
showed a fragmentation pattern identical to that previously seen
with cosmid 26A5.

A 0.15 ng sample of 26A5 cosmid DNA was used to transform 50
ml of E. coli DH5α cells (Gibco BRL), by the supplier's protocols.
A single colony isolate of that strain was inoculated into 4 ml of
TB-Amp100, and grown for 8 hr at 37°C. Chloramphenicol was added to
a final concentration of 225 ng/ml, incubation was continued for
another 24 hr, then cells were harvested by centrifugation and
frozen at -20°C. Isolation of the 26A5 cosmid DNA was by a
standard alkaline lysis miniprep (Maniatis et al., op. cit.,
p. 382), modified by increasing all volumes by 50% and with
stirring or gentle mixing, rather than vortexing, at every step.
After washing the DNA pellet in 70% ethanol, it was dissolved in TE
containing 25 ug/ml ribonuclease A (Boehringer Mannheim).

Identification of EcoR I Fragments Hybridizing to GZ4-derived and
TcaBi-Probes

Approximately 0.4 ug of cosmid 25A10 (from XL1 Blue MR cells)
and about 0.5 ug of cosmid 26A5 (from chloramphenicol-amplified
DH5α cells) were each digested with about 15 units of EcoR I (NEB)
for 85 min, frozen overnight, then heated at 65°C for five min, and
electrophoresed in a 0.7% agarose gel (SeaKem LE, 1X TEA, 80 volts,
90 min). The DNA was stained with ethidium bromide as described
above, and photographed under ultraviolet light. The EcoR I digest
of cosmid 25A10 was a complete digestion, but the sample of cosmid
26A5 was only partially digested under these conditions. The
agarose gel containing the DNA fragments was subjected to
depurination, denaturation and neutralization, followed by Southern
blotting onto a Magna NT nylon membrane, using a high salt (20X
SSC) protocol, all as described in section 2.9 of Ausubel et al.
(CPMB, op. cit.). The transferred DNA was then UV-crosslinked to
the nylon membrane as before.

A TcaC-peptide specific DNA fragment corresponding to the
insert of plasmid isolate GZ4 was amplified by PCR in a 100 ul
reaction volume as described previously above. The amplification
products from three such reactions were pooled and were extracted
from a 1% GTG agarose gel by Qiaex kit, as described above, and
quantitated by fluorometry. The gel-purified DNA (100 ng) was
labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer
Mannheim) as described above, to a specific activity of 6.34 x 10^8
dpm/ug.

The 32P-labeled GZ4 probe was boiled 10 min, then added to
"minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane
containing the digested cosmid DNA fragments was added, and
incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The
membrane was then washed 3 times at 25°C for about 5 min each
(minimal hyb wash solution), followed by two washes for 30 min each
at 60°C. The blot was exposed to film (with enhancer screens) for
about 30 min at -70°C. The GZ4 probe hybridized strongly to the
5.0 kbp (apparent size) EcoR I fragment of both cosmids, 26A5 and
25A10.

The membrane was stripped of radioactivity by boiling for
about 30 min in 0.1X SSC plus 0.1% SDS, and absence of radiolabel
was checked by exposure to film. It was then hybridized at 60°C
for 3.5 hours with the (denatured) TcaBi probe in "minimal hyb"
buffer previously used for screening the colony membranes (above),
washed as described previously, and exposed to film for 40 min at -
hybridized lightly with the about 5.0 kbp EcoR I fragment, and
strongly with a fragment of approximately 2.9 kbp.

The sample of cosmid 26A5 DNA previously described (from DH5α
cells) was used as the source of DNA from which to subclone the
bands of interest. This DNA (2.5 ug) was digested with about 3
units of EcoR I (NEB) in a total volume of 30 ul for 1.5 hr, to
give a partial digest, as confirmed by gel electrophoresis. Ten ug
of pBC KS(+) DNA (Stratagene) were digested for 1.5 hr with 20
units of EcoR I in a total volume of 20 ul, leading to total
digestion as confirmed by electrophoresis. Both EcoR I-cut DNA
preparations were diluted to 50 ul with water; to each an equal
volume of PCI was added, the suspension was gently mixed and spun
in a microcentrifuge, and the aqueous supernatant was collected.
DNA was precipitated by 150 ul ethanol, and the mixture was placed
at -20°C overnight. Following centrifugation and drying, the
EcoR I-digested pBC KS(+) was dissolved in 100 ul TE; the partially
digested 26A5 was dissolved in 20 ul TE. DNA recovery was checked
by fluorometry.

In separate reactions, approximately 60 ng of EcoR I-digested
pBC KS(+) DNA was ligated with approximately 180 ng or 270 ng of
partially digested cosmid 26A5 DNA. Ligations were carried out in
a volume of 20 ul at 15°C for 5 hr, using T4 ligase and buffer from
New England BioLabs. The ligation mixture, diluted to 100 ul with
sterile TE, was used to transform frozen, competent DH5α cells
(Gibco BRL) according to the supplier's instructions. Varying
amounts (25-200 ul) of the transformed cells were plated on freshly
prepared solid LB-Cam35 medium with 1 mM IPTG and 50 mg/l X-gal.
Plates were incubated at 37°C about 20 hr, then chilled in the dark
for approximately 3 hr to intensify color for insert selection.
White colonies were picked onto patch plates of the same
composition and incubated overnight at 37°C.

Two colony lifts of each of the selected patch plates were
prepared as follows. After picking white colonies to fresh plates,
round Magna NT nylon membranes were pressed onto the patch plates,
the membrane was lifted off, and subjected to denaturation,
neutralization and UV crosslinking as described above for the
library colony membranes. The crosslinked colony lifts were
vigorously washed, including gently wiping off the excess cell
debris with a tissue. One set was hybridized with the GZ4 (TcaC)
probe solution described earlier, and the other set was hybridized
with the TcaBi probe solution described earlier, according to the
minimal hyb protocol, followed by washing and film exposure as
described for the library colony membranes.

Colonies showing hybridization signals either only with the
GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi
probe, were selected for further work and cells were streaked for
single colony isolation onto LB-Cam35 media with IPTG and X-gal as
before. Approximately 35 single colonies, from 16 different
isolates, were picked into liquid LB-Cam35 media and grown overnight
at 37°C; the cells were collected by centrifugation and plasmid DNA
was isolated by a standard alkaline lysis miniprep according to
Maniatis et al. (op. cit. p. 368). DNA pellets were dissolved in
TE + 25 ug/ml ribonuclease A and DNA concentration was determined
by fluorometry. The EcoR I digestion pattern was analyzed by gel
electrophoresis. The following isolates were picked as useful.
Isolate A17.2 contains religated pBC KS(+) only and was used for a
(negative) control. Isolates D38.3 and C44.1 each contain only the
2.9 kbp, TcaBi-hybridizing EcoR I fragment inserted into pBC
KS(+). These plasmids, named pDAB2000 and pDAB2001, respectively,
are illustrated in Fig. 2.
Isolate A35.3 contains only the approximately 5 kbp,
GZ4-hybridizing EcoR I fragment, inserted into pBC KS(+). This
plasmid was named pDAB2002 (also Fig. 2). These isolates provided
templates for DNA sequencing.

Plasmids pDAB2000 and pDAB2001 were prepared using the
BIGprep™ kit as before. Cultures (30 ml) were grown overnight in
TB-Cam35 to an OD600 of 2, then plasmid was isolated according to the
manufacturer's directions. DNA pellets were redissolved in 100 ul
TE each, and sample integrity was checked by EcoR I digestion and
gel electrophoretic analysis.

Sequencing reactions were run in duplicate, with one replicate
using as template pDAB2000 DNA, and the other replicate using as
template pDAB2001 DNA. The reactions were carried out using the
dideoxy dye terminator cycle sequencing method, as described above
for the sequencing of the GZ4/HB14 DNAs. Initial sequencing runs
utilized as primers the LacZ and T7 primers described above, plus
primers based on the determined sequence of the TcaBi PCR
amplification product (TH1 = ATTGCAGACTGCCAATCGCTTCGG, TH12 =
GAGAGTATCCAGACCGCGGATGATCTG).

After alignment and editing of each sequencing output, each
was truncated to between 250 and 350 bases, depending on the
integrity of the chromatographic data as interpreted by the Perkin
Elmer Applied Biosystems Division SeqEd 675 software. Subsequent
new primers. With a few exceptions, primers (synthesized as
described above) were 24 bases in length with a 50% G+C
composition. Sequencing by this method was carried out on both
strands of the approximately 2.9 kbp EcoR I fragment.

To further serve as template for DNA sequencing, plasmid DNA
from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing
reactions were performed and analyzed as described above.
Initially, a T3 primer (pBS SK(+) bases 774-796:
CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS(+) bases 621-643:
GCGCGTAATACGACTCACTATAG) were used to prime the sequencing
reactions from the flanking vector sequences, reading into the
insert DNA. Another set of primers (GZ4F:
GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC;
TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204:
GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal
sequences, which were determined previously by degenerate
oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR
products. From the data generated during the initial rounds of
sequencing, new sets of primers were designed and used to walk the
entire length of the about 5 kbp fragment. A total of 55 oligo
primers was used, enabling the identification of 4832 total bp of
contiguous sequence.

When the DNA sequence of the EcoR I fragment insert of
pDAB2002 was combined with part of the determined sequence of the
pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005 bp
was generated (disclosed herein as SEQ ID NO:25). When long open
reading frames were translated into the corresponding amino acids,
the sequence clearly shows the TcaBi N-terminal peptide (disclosed
as SEQ ID NO:3), encoded by bases 68-124, immediately following a
methionine residue (start of translation). Upstream lies a
potential ribosome binding site (bases 51-58), and downstream, at
bases 215-277, is encoded the TcaBi-PT158 internal peptide
(disclosed herein as SEQ ID NO:19). Further downstream, in the
same reading frame, at bases 1787-1822, exists a sequence encoding
the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID
NO:20). Also in the same reading frame, at bases 1946-1972, is
encoded the TcaBii N-terminal peptide (disclosed herein as SEQ ID
NO:5), and the reading frame continues uninterrupted to a
translation termination codon at nucleotides 3632-3634.

The lack of an in-frame stop codon between the end of the
sequence encoding TcaBi-PT108 and the start of the TcaBii encoding
region, and the lack of a discernible ribosome binding site
immediately upstream of the TcaBii coding region, indicate that
peptides TcaBii and TcaBi are encoded by a single open reading
frame of 3567 bp (beginning at base pair 65 in SEQ ID NO:25), and
are most likely derived from a single primary gene product TcaB of
1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID
NO:26) by post-translational cleavage. If the amino acid
immediately preceding the TcaBii N-terminal peptide represents the
C-terminal amino acid of peptide TcaBi, then the predicted mass of
TcaBi (627 amino acids) is 70,814 Daltons (disclosed herein as SEQ
ID NO:28), somewhat higher than the size observed by SDS-PAGE (68
kDa). This peptide would be encoded by a contiguous stretch of
1881 base pairs (disclosed herein as SEQ ID NO:27). It is thought
that the native C-terminus of TcaBi lies somewhat closer to the
C-terminus of TcaBi-PT108. The molecular mass of PT108 (3.438 kDa;
determined during N-terminal amino acid sequence analysis of this
peptide) predicts a size of 30 amino acids. Using the size of this
peptide to designate the C-terminus of the TcaBi coding region [Glu
at position 604 of SEQ ID NO:28], the derived size of TcaBi is
determined to be 604 amino acids or 68,463 Daltons, more in
agreement with experimental observations.

Translation of the TcaBii peptide coding region of 1686 base
pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562
amino acids (disclosed herein as SEQ ID NO:30) with predicted mass
of 60,789 Daltons, which corresponds well with the observed 61 kDa.
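
The sizes quoted above can be cross-checked from the base
coordinates given for SEQ ID NO:25. The Python sketch below is
illustrative only: it assumes 1-based inclusive coordinates and that
the termination codon (3632-3634) is excluded from the coding
length, and the function names are hypothetical.

```python
def span(first: int, last: int) -> int:
    """Length of the inclusive 1-based interval [first, last]."""
    return last - first + 1


def residues(n_bases: int) -> int:
    """Amino acids encoded by a stretch that is a whole number of codons."""
    if n_bases % 3:
        raise ValueError("coding length is not a multiple of 3")
    return n_bases // 3


ORF_START = 65        # first base of the tcaB open reading frame
TCABII_START = 1946   # first base encoding the TcaBii N-terminus
STOP_START = 3632     # first base of the termination codon
PT108_START = 1787    # first base encoding the TcaBi-PT108 peptide
PT108_RESIDUES = 30   # size predicted from the 3.438 kDa mass

tcab_bp = span(ORF_START, STOP_START - 1)
upstream_bp = span(ORF_START, TCABII_START - 1)
tcabii_bp = span(TCABII_START, STOP_START - 1)

# Residue numbers (within TcaB) of the first and last PT108 residues.
pt108_first = residues(PT108_START - ORF_START) + 1
pt108_last = pt108_first + PT108_RESIDUES - 1

if __name__ == "__main__":
    print(tcab_bp, residues(tcab_bp))
    print(upstream_bp, residues(upstream_bp))
    print(tcabii_bp, residues(tcabii_bp))
    print(pt108_first, pt108_last)
```

These coordinates give 3567 bp (1189 residues) for the full reading
frame, 1881 bp (627 residues) upstream of the TcaBii N-terminus,
1686 bp (562 residues) for TcaBii, and residue 604 as the last
residue of a 30-residue PT108, in agreement with the figures stated
in the text.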

A potential ribosome binding site (bases 3682-3687) is found
48 bp downstream of the stop codon for the tcaB open reading frame.
At bases 3694-3726 is found a sequence encoding the N-terminus of
peptide TcaC (disclosed as SEQ ID NO:2). The open reading frame
initiated by this N-terminal peptide continues uninterrupted to
base 6005 (2361 base pairs, disclosed herein as the first 2361 base
pairs of SEQ ID NO:31). A gene (tcaC) encoding the entire TcaC
peptide (apparent size about 165 kDa; about 1500 amino acids)
would comprise about 4500 bp.
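
The estimate of about 4500 bp follows from the apparent mass of the
peptide. As a rough Python sketch, assuming an average residue mass
of about 110 Da (a common rule of thumb, not a value stated in this
document) and hypothetical function names:

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not from the source


def estimate_residues(mass_kda: float) -> int:
    """Approximate number of amino acids in a peptide of the given mass."""
    return round(mass_kda * 1000.0 / AVG_RESIDUE_MASS_DA)


def estimate_coding_bp(mass_kda: float) -> int:
    """Approximate coding length: three bases per residue, stop codon excluded."""
    return 3 * estimate_residues(mass_kda)


if __name__ == "__main__":
    print(estimate_residues(165), estimate_coding_bp(165))
```

With the apparent size of 165 kDa this yields about 1500 residues
and about 4500 bp, matching the estimate given above.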

Another isolate containing cloned EcoR I fragments of cosmid
26A5, E20.6, was also identified by its homology to the previously
mentioned GZ4 and TcaBi probes. Agarose gel analysis of EcoR I
digests of the DNA of the plasmid harbored by this strain
(pDAB2004, Fig. 2) revealed insert fragments of estimated sizes
2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from primers
designed from the sequence of plasmid pDAB2002 revealed that the
I fragment represented in pDAB2002. The 2361 base pair open
reading frame discovered in pDAB2002 continues uninterrupted for
another 2094 bases in pDAB2004 [disclosed herein as base pairs 2362
to 4458 of SEQ ID NO:31]. DNA sequence analysis using the parent
cosmid 26A5 DNA as template confirmed the continuity of the open
reading frame. Altogether, the open reading frame (tcaC, SEQ ID
NO:31) comprises 4455 base pairs, and encodes a protein (TcaC) of
1485 amino acids [disclosed herein as SEQ ID NO:32]. The
calculated molecular size of 166,214 Daltons is consistent with the
estimated size of the TcaC peptide (165 kDa), and the derived amino
acid sequence matches exactly that disclosed for the TcaC
N-terminal sequence [SEQ ID NO:2].

The lack of an amino acid sequence corresponding to SEQ ID
NO:17 (used to design the degenerate oligonucleotide primer pool) in
the discovered sequence indicates that the PCR products found in
isolates GZ4 and HB14, which were used as probes in the initial
library screen, were fortuitously generated by reverse-strand
priming by one of the primers in the degenerate pool. Further, the
derived protein sequence does not include the internal fragment
disclosed herein as SEQ ID NO:18. These sequences reveal that
plasmid pDAB2004 contains the complete coding region for the TcaC
peptide.

Further analysis of SEQ ID NO:25 reveals the end of an open
reading frame (bases 1-43), which encodes the final 13 amino acids
of the TcaAiii peptide, disclosed herein as SEQ ID NO:35. Only 24
bases separate the end of the TcaAiii coding region and the start of
the TcaBi coding region. Included within the 24 bases are
sequences that may serve as a ribosome binding site. Although
possible, it is not likely that a Photorhabdus gene promoter is
encoded within this short region. We propose that genomic region
tca, which includes three long open reading frames [tcaA (SEQ ID
NO:33), tcaB (SEQ ID NO:25, bases 65-3634), and tcaC (SEQ ID
NO:31), which is separated from the end of tcaB by only 59 bases] is
regulated as an operon, with transcription initiating upstream of
the start of the tcaA gene (SEQ ID NO:33), and resulting in a
polycistronic messenger RNA.

Example 9

Screening of the Photorhabdus Genome Library
for Genes Encoding the TcbAii Peptide

This example describes a method used to identify DNA clones
that contain the TcbAii peptide-encoding genes, the isolation of
the gene, and the determination of its partial DNA base sequence.

Primers and PCR Reactions

The TcbAii polypeptide of the insect active preparation is
about 206 kDa. The amino acid sequence of the N-terminus of this
peptide is disclosed as SEQ ID NO:1. Four pools of degenerate
oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and
TH-7) were synthesized to encode a portion of this amino acid
sequence, as described in Example 8, and are shown below.

Table 12

Amino
Acid     Phe     Ile     Gln     Gly     Tyr     Ser     Asp     Leu     Phe
TH-4  5'-TT(T/C) ATI     CA(A/G) GGI     TA(T/C) TCI     GA(T/C) CTI     TT-3'
TH-5  5'-TT(T/C) ATI     CA(A/G) GGI     TA(T/C) AG(T/C) GA(T/C) CTI     TT-3'
TH-6  5'-TT(T/C) ATI     CA(A/G) GGI     TA(T/C) TCI     GA(T/C) TT(A/G) TT-3'
TH-7  5'-TT(T/C) ATI     CA(A/G) GGI     TA(T/C) AG(T/C) GA(T/C) TT(A/G) TT-3'



In addition, a primary ("a") and a secondary ("b") sequence of
an internal peptide preparation (TcbAii-PT81) have been determined
and are disclosed herein as SEQ ID NO:23 and SEQ ID NO:24,
respectively. Four pools of degenerate oligonucleotides ("Reverse
Primers": TH-8, TH-9, TH-10 and TH-11) were similarly designed and
synthesized to encode the reverse complement of sequences that
encode a portion of the peptide of SEQ ID NO:23, as shown below.

Sets of these primers were used in PCR reactions to amplify
TcbAii-encoding gene fragments from the genomic Photorhabdus
luminescens W-14 DNA prepared in Example 6. All PCR reactions were
run with the "Hot Start" technique using AmpliWax™ gems and other
Perkin Elmer reagents and protocols. Typically, a mixture (total
volume 11 ul) of MgCl2, dNTP's, 10X GeneAmp PCR Buffer II, and the
primers was added to tubes containing a single wax bead. [10X
GeneAmp PCR Buffer II is composed of 100 mM Tris-HCl, pH 8.3; and
500 mM KCl.] The tubes were heated to 80°C for 2 minutes and
allowed to cool. To the top of the wax seals, a solution
containing 10X GeneAmp PCR Buffer II, DNA template, and AmpliTaq
DNA polymerase was added. Following melting of the wax seal and
mixing of components by thermal cycling, final reaction conditions
(volume of 50 ul) were: 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 2.5 mM
MgCl2; 200 uM each in dATP, dCTP, dGTP, dTTP; 1.25 uM in a single
Forward primer pool; 1.25 uM in a single Reverse primer pool; 1.25
units of AmpliTaq DNA polymerase; and 170 ng of template DNA.

The reactions were placed in a thermocycler (as in Example 8)
and run with the following program:

Table 14

Temperature   Time         Cycle Repetition
94°C          2 minutes    1X
94°C          15 seconds
55-65°C       30 seconds   30X
72°C          1 minute
72°C          7 minutes    1X
15°C          Constant

A series of amplifications was run at three different
annealing temperatures (55°, 60°, 65°C) using the degenerate primer
pools. Reactions with annealing at 65°C had no amplification
products visible following agarose gel electrophoresis. Reactions
having a 60°C annealing regime and containing primers TH-5+TH-10
produced an amplification product that had a mobility corresponding
to 2.9 kbp. A lesser amount of the 2.9 kbp product was produced
under these conditions with primers TH-7+TH-10. When reactions
were annealed at 55°C, these primer pairs produced more of the 2.9
kbp product, and this product was also produced by primer pairs
TH-5+TH-8 and TH-5+TH-11. Additional very faint 2.9 kbp bands were
seen in lanes containing amplification products from primer pairs
TH-7 plus TH-8, TH-9, TH-10, or TH-11.

To obtain sufficient PCR amplification product for cloning and
DNA sequence determination, 10 separate PCR reactions were set up
using the primers TH-5+TH-10, and were run using the above
conditions with a 55°C annealing temperature. All reactions were
pooled and the 2.9 kbp product was purified by Qiaex extraction
from an agarose gel as described above.

Additional sequences determined for TcbAii internal peptides
are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As before,
degenerate oligonucleotides (Reverse primers TH-17 and TH-18) were
made corresponding to the reverse complement of sequences that
encode a portion of the amino acid sequence of these peptides.

Table 15
From SEQ ID NO:21

Amino
Acid      Met Glu     Thr Gln     Asn     Ile Gln     Glu     Pro
TH-17  3'-TAC CT(T/C) TGI GT(T/C) TT(A/G) TAI GT(T/C) GT(T/C) GG-5'

Table 16
From SEQ ID NO:22

Amino
Acid      Asn     Pro Ile Asn     Ile Asn     Thr Gly Ile Asp
TH-18  3'-TT(A/G) GGI TAI TT(A/G) TAI TT(A/G) TGI CCI TAI CT(A/G)-5'

Degenerate oligonucleotides TH-18 and TH-17 were used in an
amplification experiment with Photorhabdus luminescens W-14 DNA as
template and primers TH-4, TH-5, TH-6, or TH-7 as the 5'- (Forward)
primers. These reactions amplified products of approximately 4 kbp
and 4.5 kbp, respectively. These DNAs were transferred from
agarose gels to nylon membranes and hybridized with a 32P-labeled
probe (as described above) prepared from the 2.9 kbp product
amplified by the TH-5+TH-10 primer pair. Both the 4 kbp and the 4.5
kbp amplification products hybridized strongly to the 2.9 kbp
probe. These results were used to construct a map ordering the
TcbAii internal peptide sequences as shown in Fig. 3. Approximate
distances between the primers are shown in nucleotides in Fig. 3.

DNA Sequence of the 2.9 kbp TcbAii-encoding Fragment

Approximately 200 ng of the purified 2.9 kbp fragment
(prepared above) was precipitated with ethanol and dissolved in 17
ml of water. One-half of this was used as sequencing template with
25 pmol of the TH-5 pool as primers; the other half was used as
template for TH-10 priming. Sequencing reactions were as given in
Example 8. No reliable sequence was produced using the TH-10
primer pool; however, reactions with the TH-5 primer pool produced
the sequence disclosed below:

  1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
 61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGGCCTGG CCGGGTTCGA ANNAAAACNA
241 GGAAATNCAC AAGTTGAGGT GATGGNTTTG TNGCNANCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG

Based on this sequence, a sequencing primer (TH-21,
5'-CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases
120-139, and initiate polymerization towards the 5' end (i.e., TH-5
end) of the gel-purified 2.9 kbp TcbAii-encoding PCR fragment. The
determined sequence is shown below, and is compared to the
biochemically determined N-terminal peptide sequence of TcbAii (SEQ
ID NO:1).
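
The relationship between TH-21 and bases 120-139 of the TH-5-primed read can be checked with a few lines of Python. This is only an illustrative sketch: it copies the first three rows (180 bases) of the sequence given above, extracts bases 120-139 (1-based, inclusive), and reverse complements them. The names `ROWS`, `READ` and `reverse_complement` are assumptions introduced for this example; N denotes an undetermined base and is left unchanged.

```python
# First 180 bases of the TH-5-primed read, copied row by row (N = undetermined).
ROWS = [
    "AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT",
    "TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC",
    "CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG",
]
READ = "".join("".join(ROWS).split())  # strip the spaces between 10-base groups

# Complement table for A, C, G, T; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Bases 120-139 in 1-based numbering are indices 119..138 in Python.
target = READ[119:139]
primer = reverse_complement(target)
print("bases 120-139:", target)
print("TH-21 (5'->3'):", primer)
```

Run as written, the reverse complement of bases 120-139 reproduces the TH-21 sequence quoted in the text.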

TcbAii 2.9 kbp PCR Fragment Sequence Confirmation

[Underlined amino acids = encoded by degenerate oligonucleotides]

SEQ ID NO:1    F   I   Q   G   Y   S   D   L   F   G   -   -   A
2.9 kbp seq   GC  ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                   M   Q   G   Y   S   D   L   F   G   N   R   A  >
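
The translation line of the comparison above can be reproduced with the standard genetic code. The sketch below is illustrative only: the codon table is built from the conventional TCAG ordering, the leading partial codon "GC" is omitted, and the names `CODON_TABLE` and `translate` are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, one letter per complete codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Complete codons read from the 2.9 kbp fragment (partial codon "GC" left out).
codons = "ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT"
peptide = translate(codons.replace(" ", ""))
print(peptide)
```

The printed peptide can then be compared residue by residue with SEQ ID NO:1.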

From the homology of the derived amino acid sequence to the
biochemically determined one, it is clear that the 2.9 kbp PCR
fragment represents the TcbA coding region. This 2.9 kbp fragment
was then used as a hybridization probe to screen the Photorhabdus
W-14 genomic library prepared in Example 8 for cosmids containing
the TcbAii-encoding gene.

Screening the Photorhabdus Cosmid Library

The 2.9 kb gel-purified PCR fragment was labeled with 32P using
the Boehringer Mannheim High Prime labeling kit as described in
Example 8. Filters containing remnants of approximately 800
colonies from the cosmid library were screened as described
previously (Example 8), and positive clones were streaked for
isolated colonies and rescreened. Three clones (8A11, 25G8, and
26D1) gave positive results through several screening and
characterization steps. No hybridization of the TcbAii-specific
probe was ever observed with any of the four cosmids identified in
Example 8, which contain the tcaB and tcaC genes. DNA from
cosmids 8A11, 25G8, and 26D1 was digested with restriction enzymes
Bgl II, EcoR I or Hind III (either alone or in combination with one
another), and the fragments were separated on an agarose gel and
transferred to a nylon membrane as described in Example 8. The
membrane was hybridized with 32P-labeled probe prepared from the 4.5
kbp fragment (generated by amplification of Photorhabdus genomic
DNA with primers TH-5+TH-17). The patterns generated from cosmid
DNAs 8A11 and 26D1 were identical to those generated with
similarly-cut genomic DNA on the same membrane. It is concluded
that cosmids 8A11 and 26D1 are accurate representations of the
genomic TcbAii-encoding locus. However, cosmid 25G8 has a single
Bgl II fragment which is slightly larger than the corresponding
genomic DNA fragment. This may result from positioning of the
insert within the vector.

DNA Sequence of the tcbA-encoding Gene

The membrane hybridization analysis of cosmid 26D1 revealed
that the 4.5 kbp probe hybridized to a single large EcoR I
fragment (greater than 9 kbp). This fragment was gel purified and
ligated into the EcoR I site of pBC KS (+) as described in Example
8, to generate plasmid pBC-S1/R1. The partial DNA sequence of the
insert DNA of this plasmid was determined by "primer walking" from
the flanking vector sequence, using procedures described in Example
8. Further sequence was generated by extension from new
oligonucleotides designed from the previously determined sequence.
When compared to the determined DNA sequence for the tcbA gene
identified by other methods (disclosed herein as SEQ ID NO:11 as
described in Example 12 below), complete homology was found to
nucleotides 1-272, 319-826, 2578-3036, and 3068-3540 (total bases =
1712). It was concluded that both approaches can be used to
identify DNA fragments encoding the TcbAii peptide.
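
The stated total of 1712 matching bases follows from the four nucleotide ranges when each range is counted inclusively. A minimal sketch of that arithmetic is given here; the variable names are illustrative and not part of the original disclosure.

```python
# Nucleotide ranges of SEQ ID NO:11 reported as completely homologous
# (1-based, inclusive coordinates as given in the text).
matched = [(1, 272), (319, 826), (2578, 3036), (3068, 3540)]

# An inclusive range start..end contains end - start + 1 positions.
lengths = [end - start + 1 for start, end in matched]
total = sum(lengths)

print(lengths, total)
```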
Analysis of the Derived Amino Acid Sequence of the tcbA Gene

The sequence of the DNA fragment identified as SEQ ID NO:11
encodes a protein whose derived amino acid sequence is disclosed
herein as SEQ ID NO:12. Several features verify the identity of
the gene as that encoding the TcbAii protein. The TcbAii
N-terminal peptide (SEQ ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe
Gly Asn Arg Ala) is encoded as amino acids 88-100. The TcbAii
internal peptide TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino
acids 1065-1077, and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as
amino acids 1571-1592. Further, the internal peptide TcbAii-PT56
(SEQ ID NO:22) is encoded as amino acids 1474-1488, and the
internal peptide TcbAii-PT103 (SEQ ID NO:21) is encoded as amino
acids 1614-1639. It is obvious that this gene is an authentic
clone encoding the TcbAii peptide as isolated from insecticidal
protein preparations of Photorhabdus luminescens strain W-14.

The protein isolated as peptide TcbAii is derived from
cleavage of a longer peptide. Evidence for this is provided by the
fact that the nucleotides encoding the TcbAii N-terminal peptide
SEQ ID NO:1 are preceded by 261 bases (encoding 87
N-terminal-proximal amino acids) of a longer open reading frame
(SEQ ID NO:11). This reading frame begins with nucleotides that
encode the amino acid sequence Met Gln Asn Ser Leu, which
corresponds to the N-terminal sequence of the large peptide TcbA,
and is disclosed herein as SEQ ID NO:16. It is thought that TcbA is
the precursor protein for TcbAii.

Relationship of tcbA, tcaB and tcaC Genes

The tcaB and tcaC genes are closely linked and may be
transcribed as a single mRNA (Example 8). The tcbA gene is borne
on cosmids that apparently do not overlap the ones harboring the
tcaB and tcaC cluster, since the respective genomic library screens
identified different cosmids. However, comparison of the amino acid
sequences encoded by the tcaB and tcaC genes with that encoded by
the tcbA gene reveals a substantial degree of homology. The amino
acid conservation (Protein Alignment Mode of MacVector™ Sequence
Analysis Software, scoring matrix pam250, hash value = 2; Oxford
Molecular Group, Campbell, CA) is shown in Fig. 4. On the score
line of each panel in Fig. 4, up carets (^) indicate homology or
conservative amino acid changes, and down carets (v) indicate
nonhomology.

This analysis shows that the amino acid sequence of the TcbA
peptide from residues 1739 to 1894 is highly homologous to amino
acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino
acids of TcaBi; SEQ ID NO:28). In addition, the sequence of TcbA
amino acids 1932 to 2459 is highly homologous to amino acids 12 to
531 of peptide TcaBii (520 of the total 562 amino acids; SEQ ID
NO:30). Considering that the TcbA peptide (SEQ ID NO:12) comprises
2505 amino acids, a total of 684 amino acids (27%) at the
C-proximal end of it is homologous to the TcaBi or TcaBii peptides,
and the homologies are arranged colinear to the arrangement of the
putative TcaB preprotein (SEQ ID NO:26). A sizeable gap in the
TcbA homology coincides with the junction between the TcaBi and
TcaBii portions of the TcaB preprotein. Clearly the TcbA and TcaB
gene products are evolutionarily related, and it is proposed that
they share some common function(s) in Photorhabdus.
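
The figure of 684 homologous residues (27% of TcbA) can be recovered from the TcbA coordinates quoted above. The following sketch is illustrative only; the dictionary keys and variable names are assumptions introduced here, and the residue ranges are taken directly from the text.

```python
TCBA_LENGTH = 2505  # amino acids in TcbA (SEQ ID NO:12), as stated in the text

# TcbA residue ranges reported as homologous to TcaBi and TcaBii (inclusive).
regions = {"TcaBi-like": (1739, 1894), "TcaBii-like": (1932, 2459)}

sizes = {name: end - start + 1 for name, (start, end) in regions.items()}
homologous = sum(sizes.values())
percent = 100 * homologous / TCBA_LENGTH

# TcbA residues lying between the two homologous regions.
between = regions["TcaBii-like"][0] - regions["TcaBi-like"][1] - 1

print(sizes, homologous, round(percent, 1), between)
```

On these coordinates the two regions cover 156 and 528 TcbA residues, respectively, and are separated by 37 residues.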



Example 10 

Characterization of Zinc-metalloproteases in Photorhabdus Broth:
Protease Inhibition, Classification, and Purification

Protease Inhibition and Classification Assays: Protease
assays were performed using FITC-casein dissolved in water as
substrate (0.08% final assay concentration). Proteolysis reactions
were performed at 25°C for 1 h in the appropriate buffer with 25 µl
of Photorhabdus broth (150 µl total reaction volume). Samples were
also assayed in the presence and absence of dithiothreitol. After
incubation, an equal volume of 12% trichloroacetic acid was added
to precipitate undigested protein. Following precipitation for 0.5
h and subsequent centrifugation, 100 µl of the supernatant was
placed into a 96-well microtiter plate and the pH of the solution
was adjusted by addition of an equal volume of 4N NaOH.
Proteolysis was then quantitated using a Fluoroskan II fluorometric
plate reader at excitation and emission wavelengths of 485 and 538
nm, respectively. Protease activity was tested over a range from
pH 5.0-10.0 in 0.5 unit increments. The following buffers were
used at 50 mM final concentration: sodium acetate (pH 5.0 - 6.5);
Tris-HCl (pH 7.0 - 8.0); and bis-Tris propane (pH 8.5-10.0). To
identify the class of protease(s) observed, crude broth was treated
with a variety of protease inhibitors (0.5 µg/µl final
concentration) and then examined for protease activity at pH 8.0
using the substrate described above. The protease inhibitors used
included E-64 (L-trans-epoxysuccinylleucylamido[4-guanidino]-
butane), 3,4 dichloroisocoumarin, leupeptin, pepstatin, amastatin,
ethylenediaminetetraacetic acid (EDTA) and 1,10 phenanthroline.

Protease assays performed over a pH range revealed that
protease(s) were present which exhibited maximal activity at about
pH 8.0 (Table 17). Addition of DTT did not have any effect on
protease activity. Crude broth was then treated with a variety of
protease inhibitors (Table 18). Treatment of crude broth with the
inhibitors described above revealed that 1,10 phenanthroline caused
complete inhibition of all protease activity when added at a final
concentration of 50 µg/ with the IC50 = 5 µg in 100 µl of a 2 mg/ml
crude broth solution. These data indicate that the most abundant
protease(s) found in the Photorhabdus broth are from the
zinc-metalloprotease class of enzymes.

Table 17

Effect of pH on the Protease Activity Found in a Day 1 Production
of Photorhabdus luminescens (Strain W-14)

pH       Flu. Units (a)     Percent Activity (b)

 5.0      3013 ±   78            17
 5.5      7994 ±  448            45
 6.0     12965 ±  483            74
 6.5     14390 ± 1291            82
 7.0     14386 ± 1287            82
 7.5     14135 ±  198            80
 8.0     17582 ±  831           100
 8.5     16183 ±  953            92
 9.0     16795 ±  760            96
 9.5     16279 ± 1022            93
10.0     15225 ±  210            87

a Fluorescence units (background = about 2200).
b Percent activity relative to the maximum at pH 8.0.
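
The percent activity column of Table 17 can be reproduced by dividing each mean fluorescence value by the maximum (the pH 8.0 value) and rounding to the nearest whole percent. The sketch below is illustrative only; it uses the mean values from the table without background subtraction, which is an assumption that happens to match the tabulated percentages, and the variable names are introduced here.

```python
# Mean fluorescence units from Table 17, keyed by assay pH.
flu_units = {
    5.0: 3013, 5.5: 7994, 6.0: 12965, 6.5: 14390, 7.0: 14386, 7.5: 14135,
    8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279, 10.0: 15225,
}

# pH at which the fluorescence signal is highest.
peak_ph = max(flu_units, key=flu_units.get)

# Percent activity relative to the peak, rounded to a whole percent.
percent_activity = {
    ph: round(100 * value / flu_units[peak_ph]) for ph, value in flu_units.items()
}

for ph, pct in percent_activity.items():
    print(f"pH {ph:4.1f}: {pct:3d}%")
```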
Table 18

Effect of Different Protease Inhibitors on the Protease Activity at
pH 8 Found in a Day 1 Production of Photorhabdus luminescens
(Strain W-14)

Inhibitor                     Corrected Flu. Units (a)   Percent Inhibition (b)

Control                            13053                       0
E-64                               14259                       0
1,10 Phenanthroline (c)               15                      99
3,4 Dichloroisocoumarin (d)         7956                      39
Leupeptin                          13074                       0
Pepstatin (c)                      13441                       0
Amastatin                          12474                       4
DMSO Control                       12005                       8
Methanol Control                   12125                       7

a Corrected flu. units = fluorescence units - background (2200
  flu. units).
b Percent inhibition relative to protease activity at pH 8.0.
c Inhibitors were dissolved in methanol.
d Inhibitors were dissolved in DMSO.

The isolation of a zinc-metalloprotease was performed by
applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose
column equilibrated at 50 mM Na2PO4, pH 7.0 as described in Example
5 for Photorhabdus toxin. After extensive washing, a 0 to 0.5 M
NaCl gradient was used to elute toxin protein. The majority of
biological activity and protein was eluted from 0.15 - 0.45 M NaCl.
However, it was observed that the majority of proteolytic activity
was present in the 0.25-0.35 M NaCl fraction with some activity in
the 0.15-0.25 M NaCl fraction. SDS PAGE analysis of the 0.25-0.35
M NaCl fraction showed a major peptide band of approximately 60
kDa. The 0.15-0.25 M NaCl fraction contained a similar 60 kDa band
but at lower relative protein concentration. Subsequent gel
filtration of this fraction using a Superose 12 HR 16/50 column
resulted in a major peak migrating at 57.5 kDa that contained a
predominant (> 90% of total stained protein) 58.5 kDa band by SDS
PAGE analysis. Additional analysis of this fraction using various
protease inhibitors as described above determined that the protease
was a zinc-metalloprotease. Nearly all of the protease activity
present in Photorhabdus broth at day 1 of fermentation corresponded
to the about 58 kDa zinc-metalloprotease.

In a second isolation of zinc-metalloprotease(s), W-14
Photorhabdus broth grown for three days was taken and protease
activity was visualized using sodium dodecyl sulfate-polyacrylamide
gel electrophoresis (SDS-PAGE) laced with gelatin as described in
Schmidt, T.M., Bleakley, B. and Nealson, K.M. 1988. SDS running
gels (5.5 x 8 cm) were made with 12.5% polyacrylamide (40% stock
solution of acrylamide/bis-acrylamide; Sigma Chemical Co., St.
Louis, MO) into which 0.1% gelatin final concentration (Biorad EIA
grade reagent; Richmond CA) was incorporated upon dissolving in
water. SDS-stacking gels (1.0 x 8 cm) were made with 5%
polyacrylamide, also laced with 0.1% gelatin. Typically, 2.5 µg of
protein to be tested was diluted in 0.03 ml of SDS-PAGE loading
buffer without dithiothreitol (DTT) and loaded onto the gel.
Proteins were electrophoresed in SDS running buffer (Laemmli, U.K.
1970. Nature 227, 680) at 0°C and at 8 mA. After electrophoresis
was complete, the gel was washed for 2 h in 2.5% (v/v) Triton
X-100. Gels were then incubated for 1 h at 37°C in 0.1 M glycine
(pH 8.0). After incubation, gels were fixed and stained overnight
with 0.1% amido black in methanol-acetic acid-water (30:10:60,
vol./vol./vol.; Sigma Chemical Co.). Protease activity was
visualized as light areas against a dark, amido black stained
background due to proteolysis and subsequent diffusion of
incorporated gelatin. At least three distinct bands produced by
proteolytic activity at 58, 41, and 38 kDa were observed.

Activity assays of the different proteases in W-14 day three
culture broth were performed using FITC-casein dissolved in water
as substrate (0.02% final assay concentration). Proteolysis
experiments were performed at 37°C for 0-0.5 h in 0.1 M Tris-HCl (pH
8.0) with different protein fractions in a total volume of 0.15 ml.
Reactions were terminated by addition of an equal volume of 12%
trichloroacetic acid (TCA) dissolved in water. After incubation at
room temperature for 0.25 h, samples were centrifuged at 10,000 x g
for 0.25 h and 0.10 ml aliquots were removed and placed into
96-well microtiter plates. The solution was then neutralized by the
addition of an equal volume of 2 N sodium hydroxide, followed by
quantitation using a Fluoroskan II fluorometric plate reader with
excitation and emission wavelengths of 485 and 538 nm,
respectively. Activity measurements were performed using
FITC-casein with different protease concentrations at 37°C for 0-10
min. A unit of activity was arbitrarily defined as the amount of
enzyme needed to produce 1000 fluorescent units/min and specific
activity was defined as units/mg of protease.
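
The unit definitions above translate into two simple ratios. The sketch below is illustrative only: the function names are assumptions introduced here, and the input values in the usage lines are hypothetical numbers chosen for the example, not measurements from this work.

```python
FLU_PER_UNIT = 1000.0  # fluorescent units/min produced by one unit of enzyme


def activity_units(flu_per_min: float) -> float:
    """Convert a rate in fluorescent units/min into arbitrary activity units."""
    return flu_per_min / FLU_PER_UNIT


def specific_activity(flu_per_min: float, protease_mg: float) -> float:
    """Specific activity in units per mg of protease."""
    return activity_units(flu_per_min) / protease_mg


# Hypothetical example: 2000 fluorescent units/min from 0.25 mg of protease.
units = activity_units(2000.0)
spec = specific_activity(2000.0, 0.25)
print(units, "units;", spec, "units/mg")
```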
Inhibition studies were performed using two zinc-
metalloprotease inhibitors: 1,10 phenanthroline and
N-(α-rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon),
with stock solutions of the inhibitors dissolved in 100% ethanol and
water, respectively. Stock concentrations were typically 10 mg/ml
and 5 mg/ml for 1,10 phenanthroline and phosphoramidon,
respectively, with final concentrations of inhibitor at 0.5-1.0
mg/ml per reaction. Treatment of three day W-14 crude broth with
1,10 phenanthroline, an inhibitor of all zinc metalloproteases,
resulted in complete elimination of all protease activity while
treatment with phosphoramidon, an inhibitor of thermolysin-like
proteases (Weaver, L.H., Kester, W.R., and Matthews, B.W. 1977. J.
Mol. Biol. 114, 119-132), resulted in about 56% reduction of
protease activity. The residual proteolytic activity could not be
further reduced with additional phosphoramidon.

The proteases of three day W-14 Photorhabdus broth were
purified as follows: 4.0 liters of broth were concentrated using
an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to
an Amicon M-12 filtration device. The flow-through material having
native proteins less than 100 kDa in size (3.8 L) was concentrated
to 0.375 L using an Amicon spiral ultrafiltration cartridge Type
S1Y10 attached to an Amicon M-12 filtration device. The retentate
material contained proteins ranging in size from 10-100 kDa. This
material was loaded onto a Pharmacia HR16/10 column which had been
packed with PerSeptive Biosystems (Framingham, MA) Poros® 50 HQ
strong anion exchange packing that had been equilibrated in 10 mM
sodium phosphate buffer (pH 7.0). Proteins were loaded on the
column at a flow rate of 5 ml/min, followed by washing unbound
protein with buffer until A280 = 0.00. Afterwards, proteins were
eluted using a NaCl gradient of 0-1.0 M NaCl in 40 min at a flow
rate of 7.5 ml/min. Fractions were assayed for protease activity,
supra., and active fractions were pooled. Proteolytically active
fractions were diluted with 50% (v/v) 10 mM sodium phosphate buffer
(pH 7.0) and loaded onto a Pharmacia HR 10/10 Mono Q column
equilibrated in 10 mM sodium phosphate. After washing the column
with buffer until A280 = 0.00, proteins were eluted using a NaCl
gradient of 0-0.5 M NaCl for 1 h at a flow rate of 2.0 ml/min.
Fractions were assayed for protease activity. Those fractions
having the greatest amount of phosphoramidon-sensitive protease
activity, the phosphoramidon-sensitive activity being due to the
41/38 kDa protease, infra., were pooled. These fractions were
found to elute at a range of 0.15-0.25 M NaCl. Fractions
containing a predominance of phosphoramidon-insensitive protease
activity, the 58 kDa protease, were also pooled. These fractions
were found to elute at a range of 0.25-0.35 M NaCl. The
phosphoramidon-sensitive protease fractions were then concentrated
to a final volume of 0.75 ml using a Millipore Ultrafree®-15
centrifugal filter device Biomax-5K NMWL membrane. This material
was applied at a flow rate of 0.5 ml/min to a Pharmacia HR 10/30
column that had been packed with Pharmacia Sephadex G-50
equilibrated in 10 mM sodium phosphate buffer (pH 7.0)/0.1 M NaCl.
Fractions having the maximal phosphoramidon-sensitive protease
activity were then pooled and centrifuged over a Millipore
Ultrafree®-15 centrifugal filter device Biomax-50K NMWL membrane.
Proteolytic activity analysis, supra., indicated this material to
have only phosphoramidon-sensitive protease activity. Pooling of
the phosphoramidon-insensitive protease, the 58 kDa protein, was
followed by concentrating in a Millipore Ultrafree®-15 centrifugal
filter device Biomax-50K NMWL membrane and further separation on a
Pharmacia Superdex-75 column. Fractions containing the protease
were pooled.

Analysis of the purified 58 and 41/38 kDa proteases
revealed that, while both types of protease were completely
inhibited with 1,10 phenanthroline, only the 41/38 kDa protease was
inhibited with phosphoramidon. Further analysis of crude broth
indicated that 23% of the total protease activity of day 1 W-14
broth is due to the 41/38 kDa protease, increasing to 44% in day
three W-14 broth.

Standard SDS-PAGE analysis for examining protein purity and
obtaining amino terminal sequence was performed using 4-20%
gradient MiniPlus SepraGels purchased from Integrated Separation
Systems (Natick, MA). Proteins to be amino-terminal sequenced were
blotted onto PVDF membrane (ProBlott™ Membranes; Applied
Biosystems, Foster City, CA) following purification, infra.,
visualized with 0.1% amido black, excised, and sent to Cambridge
Prochem, Cambridge, MA, for sequencing.

Deduced amino terminal sequence of the 58- (SEQ ID NO:45) and
41/38 kDa (SEQ ID NO:44) proteases from three day old W-14 broth
... (SEQ ID NO:45) and DSGDDDKVTN'IWPHR (SEQ ID NO:44),
respectively.

Sequencing of the 41/38 kDa protease revealed several amino
termini, each one having an additional amino acid removed by
proteolysis. Examination of the primary, secondary, tertiary and
quaternary sequences for the 38 and 41 kDa polypeptides allowed for
deduction of the sequence shown above and revealed that these two
proteases are homologous.

Example 11, Part A

Screening of Photorhabdus Genomic Library Via Use of Antibodies for
Genes Encoding TcbA Peptide

In parallel to the sequencing described above, suitable
probing and sequencing was done based on the TcbAii peptide (SEQ ID
NO:1). This sequencing was performed by preparing bacterial
culture broths and purifying the toxin as described in Examples 1
and 2 above.

Genomic DNA was isolated from the Photorhabdus luminescens
strain W-14 grown in Grace's insect tissue culture medium. The
bacteria were grown in 5 ml of culture medium in a 250 ml
Erlenmeyer flask at 28°C and 250 rpm for approximately 24 hours.
Bacterial cells from 100 ml of culture medium were pelleted at 5000
x g for 10 minutes. The supernatant was discarded, and the cell
pellets then were used for the genomic DNA isolation.

The genomic DNA was isolated using a modification of the CTAB
method described in Section 2.4.3 of Ausubel (supra.). The section
entitled "Large Scale CsCl prep of bacterial genomic DNA" was
followed through step 6. At this point, an additional
chloroform/isoamyl alcohol (24:1) extraction was performed followed
by a phenol/chloroform/isoamyl alcohol (25:24:1) extraction step
and a final chloroform/isoamyl alcohol (24:1) extraction. The DNA
was precipitated by the addition of a 0.6 volume of isopropanol. The
precipitated DNA was hooked and wound around the end of a bent
glass rod, dipped briefly into 70% ethanol as a final wash, and
dissolved in 3 ml of TE buffer.

The DNA concentration, estimated by optical density at 280/260
nm, was approximately 2 mg/ml.

Using this genomic DNA, a library was prepared. Approximately
50 µg of genomic DNA was partly digested with Sau3A I. Then NaCl
density gradient centrifugation was used to size fractionate the
partially digested DNA fragments. Fractions containing DNA
fragments with an average size of 12 kb, or larger, as determined
by agarose gel electrophoresis, were ligated into the plasmid
Bluescript (Stratagene, La Jolla, California) and transformed into
an E. coli DH5α or DH10B strain.
Separately, purified aliquots of the protein were sent to the
biotechnology hybridoma center at the University of Wisconsin,
Madison for production of monoclonal antibodies to the proteins.
The material that was sent was the HPLC purified fraction
containing native bands 1 and 2 which had been denatured at 65°C,
and 20 µg of which was injected into each of four mice. Stable
monoclonal antibody-producing hybridoma cell lines were recovered
after spleen cells from unimmunized mouse were fused with a stable
myeloma cell line. Monoclonal antibodies were recovered from the
hybridomas.

Separately, polyclonal antibodies were created by taking
native agarose gel purified band 1 (see Example 1) protein which
was then used to immunize a New Zealand white rabbit. The protein
was prepared by excising the band from the native agarose gels,
briefly heating the gel pieces to 65°C to melt the agarose, and
immediately emulsifying with adjuvant. Freund's complete adjuvant
was used for the primary immunizations and Freund's incomplete was
used for 3 additional injections at monthly intervals. For each
injection, approximately 0.2 ml of emulsified band 1, containing 50
to 100 micrograms of protein, was delivered by multiple
subcutaneous injections into the back of the rabbit. Serum was
obtained 10 days after the final injection and additional bleeds
were performed at weekly intervals for 3 weeks. The serum
complement was inactivated by heating to 56°C for 15 minutes and
then stored at -20°C.

The monoclonal and polyclonal antibodies were then used to
screen the genomic library for the expression of antigens which
could be detected by the epitope. Positive clones were detected on
nitrocellulose filter colony lifts. An immunoblot analysis of the
positive clones was undertaken.

An analysis of the clones as defined by both immunoblot and
Southern analysis resulted in the tentative identification of four
genomic regions.

In the first region was a gene encoding the peptide designated
here as TcbAii. Full DNA sequence of this gene (tcbA) was
obtained. It is set forth as SEQ ID NO:11. Confirmation that the
sequence encodes the internal sequence of SEQ ID NO:1 is
demonstrated by the presence of SEQ ID NO:1 at amino acid number 88
from the deduced amino acid sequence created by the open reading
frame of SEQ ID NO:11. This can be confirmed by referring to SEQ
ID NO:12, which is the deduced amino acid sequence created by SEQ
ID NO:11.

The second region of toxin peptides contains the segments
referred to above as TcaBi, TcaBii and TcaC. Following the
screening of the library with the polyclonal antisera, this second
region of toxin genes was identified by several clones which
produced different size proteins, all of which cross-reacted with
the polyclonal antibody on an immunoblot and were also found to
share DNA homology on a Southern blot. Sequence comparison
revealed that they belonged to the gene complex designated TcaB and
TcaC above.

Two other regions of antibody toxin clones were also isolated
in the polyclonal screen. These regions produced proteins that
cross-react with a polyclonal antibody and also shared DNA homology
with the regions as determined by Southern blotting. Thus, it
appears that the Photorhabdus luminescens extracellular protein
genes represent a family of genes which are evolutionarily related.

To further pursue the concept that there might be
evolutionarily related variations in the toxin peptides contained
within this organism, two approaches have been undertaken to
examine other strains of Photorhabdus luminescens for the presence
of related proteins. This was done both by PCR amplification of
genomic DNA and by immunoblot analysis using the polyclonal and
monoclonal antibodies.

The results indicate that related proteins are produced by
Photorhabdus luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6,
WX-1, WX-8, WX-11, WX-12, WX-15 and W-14.



Example 11, Part B
Sequence and Analysis of tcc Toxin Clones



Further DNA sequencing was performed on plasmids isolated from
E. coli clones described in Example 11, Part A. The nucleotide
sequence from the third region of E. coli clones was shown to
comprise three closely linked open reading frames at this genomic
locus. This locus was designated tcc, with the three open reading
frames designated tccA (SEQ ID NO:56), tccB (SEQ ID NO:58) and tccC
(SEQ ID NO:60). The close linkage between these open reading frames
is revealed by examination of SEQ ID NO:56, in which 93 bp separate
the stop codon of tccA from the start codon of tccB (bases
2992-2994 of SEQ ID NO:56), and by examination of SEQ ID NO:58, in
which 131 bases separate the stop codon of tccB from tccC (bases
4930-4932 of SEQ ID NO:58). The physical map is presented in
Fig. 6B.

The deduced amino acid sequence from the tccA open reading
frame indicates that the gene encodes a protein of 105,459 Da.
This protein was designated TccA (SEQ ID NO:57). The first 12
amino acids of this protein match the N-terminal sequence obtained
from a 108 kDa protein, SEQ ID NO:8, previously identified as part
of the toxin complex.

The deduced amino acid sequence from the tccB open reading
frame indicates that this gene encodes a protein of 175,716 Da.
This protein was designated TccB (SEQ ID NO:59). The first 11
amino acids of this protein match the N-terminal sequence obtained
from a protein with estimated molecular weight of 185 kDa, SEQ ID
NO:7. Similarity analysis revealed that the TccB protein is related
to the proteins identified as TcbA SEQ ID NO:12 (37% similarity and
28% identity), TcdA SEQ ID NO:47 (35% similarity and 28% identity),
and TcaB SEQ ID NO:26 (32% similarity and 26% identity), using the
GAP algorithm of the Wisconsin Package Version 9.0, Genetics
Computer Group (GCG), Madison, Wisconsin.

The deduced amino acid sequence of tccC indicated that this
open reading frame encodes a protein of 111,694 Da and the protein
product was designated TccC (SEQ ID NO:61).

Example 12

Characterization of Photorhabdus Strains

In order to establish that the collection described herein was
comprised of Photorhabdus strains, the strains herein were assessed
in terms of recognized microbiological traits that are
characteristic of Photorhabdus and which differentiate it from
other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J.J. 1984.
Bergey's Manual of Systematic Bacteriology, Vol 1. pp. 510-511. (ed.
Krieg, N.R. and Holt, J.G.). Williams & Wilkins, Baltimore;
Akhurst and Boemare, 1988; Boemare et al., 1993). These
characteristic traits are as follows: Gram's stain negative rods,
organism size of 0.5-2 µm in width and 2-10 µm in length,
red/yellow colony pigmentation, presence of crystalline inclusion
bodies, presence of catalase, inability to reduce nitrate, presence
of bioluminescence, ability to take up dye from growth media,
positive for protease production, growth-temperature range below
37°C, survival under anaerobic conditions and positively motile
(Table 20). Reference Escherichia coli, Xenorhabdus and
Photorhabdus strains were included in all tests for comparison.
The overall results are consistent with all strains being part of
the family Enterobacteriaceae and the genus Photorhabdus.

A luminometer was used to establish the bioluminescence of
each strain and provide a quantitative and relative measurement of
light production. For measurement of relative light emitting
units, the broths from each strain (cells and media) were measured
at three time intervals after inoculation in liquid culture (6, 12,
and 24 hr) and compared to background luminosity (uninoculated
media and water). Prior to measuring light emission from the
various broths, cell density was established by measuring light
absorbance (560 nm) in a Gilford Systems (Oberlin, OH)
spectrophotometer using a sipper cell. Appropriate dilutions were
then made (to normalize optical density to 1.0 unit) before
measuring luminosity. Aliquots of the diluted broths were then
placed into cuvettes (300 µl each) and read in a Bio-Orbit 1251
Luminometer (Bio-Orbit Oy, Turku, Finland). The integration period
for each sample was 45 seconds. The samples were continuously
mixed (spun in baffled cuvettes) while being read to provide oxygen
availability. A positive test was determined as being >5-fold
background luminescence (about 5-10 units). In addition, colony
luminosity was detected with photographic film overlays and
visually, after adaptation in a darkroom. The Gram's staining
characteristics of each strain were established with a commercial
Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with
Gram's stain control slides (Fisher Scientific, Pittsburgh, PA).
Microscopic evaluation was then performed using a Zeiss microscope
(Carl Zeiss, Germany) 100X oil immersion objective lens (with 10X
ocular and 2X body magnification). Microscopic examination of
individual strains for organism size, cellular description and
inclusion bodies (the latter after logarithmic growth) was
performed using wet mount slides (10X ocular, 2X body and 40X
objective magnification) with oil immersion and phase contrast
microscopy with a micrometer (Akhurst, R. J. and Boemare, N. E. 1990.
Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R.
and Kaya, H.), pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian,
S., Boyer-Giglio, M. H., Thaler, J. O., Bonnot, G., Boemare, N. 1993.
Biol. Cell 79, 177-185). Colony pigmentation was observed after
inoculation on Bacto nutrient agar (Difco Laboratories, Detroit,
MI) prepared as per label instructions. Incubation occurred at
28°C and descriptions were produced after 5-7 days. To test for
the presence of the enzyme catalase, a colony of the test organism
was removed on a small plug from a nutrient agar plate and placed
into the bottom of a glass test tube. One ml of a household
hydrogen peroxide solution was gently added down the side of the
tube. A positive reaction was recorded when bubbles of gas
(presumptive oxygen) appeared immediately or within 5 seconds.
Controls of uninoculated nutrient agar and hydrogen peroxide
solution were also examined. To test for nitrate reduction, each
culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco
Laboratories, Detroit, MI). After 24 hours incubation at 28°C,
nitrite production was tested by the addition of two drops of
sulfanilic acid reagent and two drops of alpha-naphthylamine
reagent (see Difco Manual, 10th edition, Difco Laboratories,
Detroit, MI, 1984). The generation of a distinct pink or red color
indicates the formation of nitrite from nitrate. The ability of
each strain to take up dye from growth media was tested with Bacto
MacConkey agar containing the dye neutral red, Bacto Tergitol-7
agar containing the dye bromothymol blue and Bacto EMB Agar
containing the dye eosin-Y (agars from Difco Laboratories, Detroit,
MI, all prepared according to label instructions). After
inoculation on these media, dye uptake was recorded after
incubation at 28°C for 5 days. Growth on these latter media is
characteristic for members of the family Enterobacteriaceae.
Motility of each strain was tested using a solution of Bacto
Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as
per label instructions. A butt-stab inoculation was performed with
each strain and motility was judged macroscopically by a diffuse
zone of growth spreading from the line of inoculum. In many cases,
motility was also observed microscopically from liquid culture
under wet mount slides. Biochemical nutrient evaluation for each
strain was performed using BBL Enterotube II (Becton Dickinson,
Germany). Product instructions were followed with the exception
that incubation was carried out at 28°C for 5 days. Results were
consistent with previously cited reports for Photorhabdus. The
production of protease was tested by observing hydrolysis of
gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI)
plates made as per label instructions. Cultures were inoculated
and the plates were incubated at 28°C for 5 days. To assess growth
at different temperatures, agar plates [2% proteose peptone #3 with
two percent Bacto-Agar (Difco, Detroit, MI) in deionized water]
were streaked from a common source of inoculum. Plates were sealed
with Nesco® film and incubated at 20, 28 and 37°C for up to three
weeks. Plates showing no growth at 37°C showed no cell viability
after transfer to a 28°C incubator for one week. Oxygen
requirements for Photorhabdus strains were tested in the following
manner. A butt-stab inoculation into fluid thioglycolate broth
medium (Difco, Detroit, MI) was made. The tubes were incubated at
room temperature for one week and cultures were then examined for
type and extent of growth. The indicator resazurin demonstrates
the level of medium oxidation or the aerobiosis zone (Difco Manual,
10th edition, Difco Laboratories, Detroit, MI). Growth zone
results obtained for the Photorhabdus strains tested were
consistent with those of a facultative anaerobic microorganism.

C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction,
G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake,
J=Pigmentation, K=Growth on EMB agar, L=Growth on MacConkey agar,
M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at
20°C, P=Growth at 28°C, Q=Growth at 37°C, +/- = positive or
negative for trait, rd=rod, S=sized within Genus descriptors,
RO=red-orange, LR=light red, R=red, O=orange, Y=yellow, T=tan,
LY=light yellow, YT=yellow tan, and LO=light orange.
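
The bioluminescence test described above reduces to two small calculations: diluting each broth to an optical density of 1.0 and comparing the luminometer reading with background. The following is a minimal Python sketch of that logic only. The function names and the example readings are hypothetical and are not taken from the work described; only the 1.0 OD target and the >5-fold criterion come from the text.

```python
def dilution_factor(measured_od, target_od=1.0):
    """Fold dilution needed to bring a broth down to the target optical density."""
    if measured_od <= 0:
        raise ValueError("optical density must be positive")
    # Assumption: broths already at or below the target are read undiluted.
    return max(measured_od / target_od, 1.0)


def is_luminescent(sample_units, background_units, fold_threshold=5.0):
    """True when the reading is more than fold_threshold times background."""
    return sample_units > fold_threshold * background_units


# Hypothetical readings, for illustration only.
print(dilution_factor(2.5))          # broth at OD 2.5 needs a 2.5-fold dilution
print(is_luminescent(60.0, 10.0))    # 60 units against a background of 10 units
```

A strict "greater than" comparison is used because the text states the criterion as >5-fold background.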

Cellular fatty acid analysis is a recognized tool for
bacterial characterization at the genus and species level
(Tornabene, T. G. 1985. Lipid Analysis and the Relationship to
Chemotaxonomy in Methods in Microbiology, Vol. 18, 209-234;
Goodfellow, M. and O'Donnell, A. G. 1993. Roots of Bacterial
Systematics in Handbook of New Bacterial Systematics (ed.
Goodfellow, M. & O'Donnell, A. G.), pp. 3-54. London: Academic Press
Ltd.; these references are incorporated herein by reference), and
was used to confirm that our collection was related at the genus
level. Cultures were shipped to an external contract laboratory
for fatty acid methyl ester (FAME) analysis using a Microbial ID
(MIDI, Newark, DE, USA) Microbial Identification System (MIS). The
MIS system consists of a Hewlett Packard HP5890A gas chromatograph
with a 25 m x 0.2 mm 5% methylphenyl silicone fused silica capillary
column. Hydrogen is used as the carrier gas and a flame-ionization
detector functions in conjunction with an automatic sampler,
integrator and computer. The computer compares the sample fatty
acid methyl esters to a microbial fatty acid library and against a
calibration mix of known fatty acids. As selected by the contract
laboratory, strains were grown for 24 hours at 28°C on trypticase
soy agar prior to analysis. Extraction of samples was performed by
the contract lab as per standard FAME methodology. There was no
direct identification of the strains to any luminescent bacterial
group other than Photorhabdus. When the cluster analysis was
performed, which compares the fatty acid profiles of a group of
isolates, the strain fatty acid profiles were related at the genus
level.

The evolutionary diversity of the Photorhabdus strains in our
collection was measured by analysis of PCR (Polymerase Chain
Reaction) mediated genomic fingerprinting using genomic DNA from
each strain. This technique is based on families of repetitive DNA
sequences present throughout the genome of diverse bacterial
species (reviewed by Versalovic, J., Schneider, M., de Bruijn,
F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol. 5, 25-40).
Three of these, repetitive extragenic palindromic sequence (REP),
enterobacterial repetitive intergenic consensus (ERIC) and the BOX
element, are thought to play an important role in the organization
of the bacterial genome. Genomic organization is believed to be
shaped by selection, and the differential dispersion of these
elements within the genome of closely related bacterial strains can
be used to discriminate these strains (e.g., Louws, F. J.,
Fulbright, D. W., Stephens, C. T. and de Bruijn, F. J. 1994. Appl.
Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide
primers complementary to these repetitive sequences to amplify the
variably sized DNA fragments lying between them. The resulting
products are separated by electrophoresis to establish the DNA
"fingerprint" for each strain.

To isolate genomic DNA from our strains, cell pellets were
resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a
final volume of 10 ml and 12 ml of 5 M NaCl was then added. This
mixture was centrifuged 20 min. at 15,000 x g. The resulting
pellet was resuspended in 5.7 ml of TE and 300 µl of 10% SDS and 60
µl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY)
were added. This mixture was incubated at 37°C for 1 hr,
approximately 10 mg of lysozyme was then added and the mixture was
incubated for an additional 45 min. One milliliter of 5 M NaCl and
800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then
added and the mixture was incubated 10 min. at 65°C, gently
agitated, then incubated and agitated for an additional 20 min. to
aid in clearing of the cellular material. An equal volume of
chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed
gently, then centrifuged. Two extractions were then performed with
an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1).
Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with
70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl
pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by
optical density at 260 nm. To perform rep-PCR analysis of
Photorhabdus genomic DNA the following primers were used: REP1R-I,
5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'. PCR
was performed using the following 25 µl reaction: 7.75 µl H2O, 2.5
µl 10X LA buffer (PanVera Corp., Madison, WI), 16 µl dNTP mix (2.5
mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl
genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and 0.25
µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR
amplification was performed in a Perkin Elmer DNA Thermal Cycler
(Norwalk, CT) using the following conditions: 95°C/7 min. then 35
cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15 min.
at 65°C. After cycling, the 25 µl reaction was added to 5 µl of 6X
gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in
H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M
Tris-borate, 0.002 M EDTA) using 8 µl of each reaction. The gel
was run for approximately 16 hours at 45 V. Gels were then stained
in 20 µg/ml ethidium bromide for 1 hour and destained in TBE buffer
for approximately 3 hours. Polaroid® photographs of the gels were
then taken under UV illumination.
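
Quantitation of DNA by optical density at 260 nm, as mentioned above, is conventionally done with the factor of about 50 µg/ml per absorbance unit for double-stranded DNA in a 1 cm path. The following Python sketch illustrates that conversion and the resulting template mass in a 1.5 µl aliquot. The text does not state which factor or dilution was used, so the factor, the function names and the example reading are assumptions for illustration.

```python
# Assumption: standard conversion for double-stranded DNA, 1 cm light path.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dna_conc_ug_per_ul(a260, dilution_factor):
    """DNA concentration of the undiluted stock in µg/µl."""
    ug_per_ml = a260 * dilution_factor * DSDNA_UG_PER_ML_PER_A260
    return ug_per_ml / 1000.0


def template_mass_ug(conc_ug_per_ul, volume_ul=1.5):
    """Mass of genomic DNA delivered by the template aliquot."""
    return conc_ug_per_ul * volume_ul


# Hypothetical reading: A260 of 0.15 measured on a 1:50 dilution.
conc = dna_conc_ug_per_ul(0.15, 50)
print(conc, template_mass_ug(conc))
```

With these made-up inputs the stock falls inside the 0.075-0.480 µg/µl range reported in the text.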

The presence or absence of bands at specific sizes for each
strain was scored from the photographs and entered as a similarity
matrix in the numerical taxonomy software program NTSYS-pc (Exeter
Software, Setauket, NY). Controls of E. coli strain HB101 and
Xanthomonas oryzae pv. oryzae assayed at the same time produced PCR
"fingerprints" corresponding to published reports (Versalovic, J.,
Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids Res. 19, 6823-6831;
Vera Cruz, C. M., Halda-Alija, L., Louws, F., Skinner, D. Z.,
George, M. L., Nelson, R. J., de Bruijn, F. J., Rice, C. and Leach,
J. E. 1995. Int. Rice Res. Notes 20, 23-24; Vera Cruz, C. M.,
Ardales, E. Y., Skinner, D. Z., Talag, J., Nelson, R. J., Louws,
F. J., Leung, H., Mew, T. W. and Leach, J. E. 1996. Phytopathology
(in press), respectively). The data from Photorhabdus strains were
then analyzed with a series of programs within NTSYS-pc: SIMQUAL
(Similarity for Qualitative data) to generate a matrix of
similarity coefficients (using the Jaccard coefficient) and SAHN
(Sequential, Agglomerative, Hierarchical and Nested) clustering
[using the UPGMA (Unweighted Pair-Group Method with Arithmetic
Averages) method], which groups related strains and can be expressed
as a phenogram (Fig. 5). The COPH (cophenetic values) and MXCOMP
(matrix comparison) programs were used to generate a cophenetic
value matrix and compare the correlation between this and the
original matrix upon which the clustering was based. A resulting
normalized Mantel statistic (r) was generated which is a measure of
the goodness of fit for a cluster analysis (r=0.8-0.9 represents a
very good fit). In our case r = 0.919. Therefore, our collection
is comprised of a diverse group of easily distinguishable strains
representative of the Photorhabdus genus.
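
The analysis chain described above (Jaccard coefficients, UPGMA clustering, then a cophenetic matrix compared with the original matrix) can be illustrated with a short, self-contained Python sketch. This is not the NTSYS-pc implementation. The band profiles below are made up for four hypothetical strains, the function names are illustrative, and the goodness-of-fit value is computed here as the Pearson correlation between the original and cophenetic distances.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard coefficient of two presence/absence (1/0) band vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    present = sum(1 for x, y in zip(a, b) if x or y)
    return shared / present if present else 1.0


def distance_matrix(profiles):
    """Map each strain pair to 1 - Jaccard similarity."""
    return {frozenset((p, q)): 1.0 - jaccard(profiles[p], profiles[q])
            for p, q in combinations(profiles, 2)}


def upgma_cophenetic(names, dist):
    """Cluster by UPGMA and return the cophenetic distance of every pair."""
    def average(c1, c2):
        # UPGMA distance: mean of all between-cluster pairwise distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    cophenetic = {}
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        height = average(c1, c2)
        # Every cross-cluster pair first joins at this merge height.
        for a in c1:
            for b in c2:
                cophenetic[frozenset((a, b))] = height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return cophenetic


def cophenetic_correlation(dist, cophenetic):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(dist)
    xs = [dist[p] for p in pairs]
    ys = [cophenetic[p] for p in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical band scores for four made-up strains (not data from the text).
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 0, 1, 1, 1, 0],
    "D": [0, 1, 1, 0, 1, 0],
}
dist = distance_matrix(profiles)
coph = upgma_cophenetic(list(profiles), dist)
r = cophenetic_correlation(dist, coph)
print(f"cophenetic correlation r = {r:.3f}")
```

A value of r close to 1 means the phenogram distorts the original pairwise distances very little, which is the sense in which the text treats r as a goodness-of-fit measure.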



Example 13

Insecticidal Utility of Toxin(s) Produced
by Various Photorhabdus Strains

Initial "seed" cultures of the various Photorhabdus strains
were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3)
(Difco Laboratories, Detroit, MI) liquid media with a primary
variant subclone in a 500 ml tribaffled flask with a Delong neck,
covered with a Kaput. Inoculum for each seed culture was derived
from oil-overlay agar slant cultures or plate cultures. After
inoculation, these flasks were incubated for 16 hrs at 28°C on a
rotary shaker at 150 rpm. These seed cultures were then used as
uniform inoculum sources for a given fermentation of each strain.
Additionally, overlaying the post-log seed culture with sterile
mineral oil, adding a sterile magnetic stir bar for future
resuspension and storing the culture in the dark at room
temperature provided long-term preservation of inoculum in a
toxin-competent state. The production broths were inoculated by adding
1% of the actively growing seed culture to fresh 2% PP3 media
(e.g., 1.75 ml per 175 ml fresh media). Production of broths
occurred in either 500 ml tribaffled flasks (see above), or 2800 ml
baffled, convex bottom flasks (500 ml volume) covered by a silicone
foam closure. Production flasks were incubated for 24-48 hrs under
the above mentioned conditions. Following incubation, the broths
were dispensed into sterile 1 L polyethylene bottles, spun at 2600
x g for 1 hr at 10°C and decanted from the cell and debris pellet.
The liquid broth was then vacuum filtered through Whatman GF/D (2.7
µm retention) and GF/B (1.0 µm retention) glass filters to remove
debris. Further broth clarification was achieved with a tangential
flow microfiltration device (Pall Filtron, Northborough, MA) using
a 0.5 µm open-channel filter. When necessary, additional
clarification could be obtained by chilling the broth (to 4°C) and
centrifuging for several hours at 2600 x g. Following these
procedures, the broth was filter sterilized using a 0.2 µm
nitrocellulose membrane filter. Sterile broths were then used
directly for biological assay or biochemical analysis, or were concentrated
(up to 15-fold) using a 10,000 MW cut-off, M12 ultrafiltration
device (Amicon, Beverly, MA) or centrifugal concentrators
(Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a
10,000 MW pore size. In the case of centrifugal concentrators, the
broth was spun at 2000 x g for approximately 2 hr. The 10,000 MW
permeate was added to the corresponding retentate to achieve the
desired concentration of components greater than 10,000 MW. Heat
inactivation of processed broth samples was achieved by heating the
samples at 100°C in a sand-filled heat block for 10 minutes.
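
The volume arithmetic in the fermentation and concentration steps above (a 1% inoculum, and adding permeate back to the retentate to reach a chosen fold concentration) can be sketched in Python as follows. The function names and the example retentate volume are hypothetical; only the 1% inoculum, the 175 ml culture volume and the 15-fold upper limit come from the text.

```python
def inoculum_volume_ml(fresh_media_ml, inoculum_fraction=0.01):
    """Seed culture volume for a given volume of fresh medium (1% by default)."""
    return fresh_media_ml * inoculum_fraction


def target_volume_ml(broth_ml, fold_concentration):
    """Final volume that corresponds to the requested fold concentration."""
    if fold_concentration < 1:
        raise ValueError("fold concentration must be at least 1")
    return broth_ml / fold_concentration


def permeate_to_add_ml(broth_ml, fold_concentration, retentate_ml):
    """Permeate to return to an over-concentrated retentate."""
    shortfall = target_volume_ml(broth_ml, fold_concentration) - retentate_ml
    if shortfall < 0:
        raise ValueError("retentate is not yet concentrated enough")
    return shortfall


print(inoculum_volume_ml(175))            # seed volume for 175 ml of fresh medium
# Hypothetical case: 500 ml of broth reduced to 40 ml, target 10-fold (50 ml).
print(permeate_to_add_ml(500, 10, 40))
```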



The broth(s) and toxin complex(es) from different Photorhabdus
strains are useful for reducing populations of insects and were
used in a method of inhibiting an insect population which comprises
applying to a locus of the insect an effective insect inactivating
amount of the active described. A demonstration of the breadth of
insecticidal activity observed from broths of a selected group of
Photorhabdus strains fermented as described above is shown in Table
20. It is possible that additional insecticidal activities could
be detected with these strains through increased concentration of
the broth or by employing different fermentation methods.
Consistent with the activity being associated with a protein, the
insecticidal activity of all strains tested was heat labile (see
above).

Culture broth(s) from diverse Photorhabdus strains show
differential insecticidal activity (mortality and/or growth
inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm
larvae and boll weevil larvae, which are members of the insect order
Coleoptera. Other members of the Coleoptera include wireworms,
pollen beetles, flea beetles, seed beetles and Colorado potato
beetle. Activity is also observed against aster leafhopper and
corn planthopper, which are members of the order Homoptera. Other
members of the Homoptera include planthoppers, pear psylla, apple
sucker, scale insects, whiteflies, spittle bugs as well as numerous
host specific aphid species. The broths and purified toxin
complex(es) are also active against tobacco budworm, tobacco
hornworm and European corn borer, which are members of the order
Lepidoptera. Other typical members of this order are beet
armyworm, cabbage looper, black cutworm, corn earworm, codling
moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm,
cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and
fall armyworm. Activity is also seen against fruitfly and mosquito
larvae, which are members of the order Diptera. Other members of
the order Diptera are pea midge, carrot fly, cabbage root fly,
turnip root fly, onion fly, crane fly, house fly and various
mosquito species. Activity with broth(s) and toxin complex(es) is
also seen against two-spotted spider mite, which is a member of the
order Acarina, which includes strawberry spider mites, broad mites,
citrus red mite, European red mite, pear rust mite and tomato
russet mite.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (0-15 fold concentrated, filter
sterilized), 2% Proteose Peptone #3, purified toxin complex(es) or 10
mM sodium phosphate buffer, pH 7.0, were applied directly to the
surface (about 1.5 cm²) of artificial diet (Rose, R. I. and McCabe,
J. M. (1973). J. Econ. Entomol. 66, 398-400) in 40 µl aliquots.
Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0.
The diet plates were allowed to air-dry in a sterile flow-hood and
the wells were infested with single, neonate Diabrotica
undecimpunctata howardi (Southern corn rootworm, SCR) hatched from
surface sterilized eggs. The plates were sealed, placed in a
humidified growth chamber and maintained at 27°C for the
appropriate period (3-5 days). Mortality and larval weight
determinations were then scored. Generally, 16 insects per
treatment were used in all studies. Control mortality was
generally less than 5%.

Activity against boll weevil (Anthonomus grandis) was tested
as follows. Concentrated (1-10 fold) Photorhabdus broths, control
medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23
mg/ml] or 10 mM sodium phosphate buffer, pH 7.0, were applied in 60
µl aliquots to the surface of 0.35 g of artificial diet (Stoneville
Yellow lepidopteran diet) and allowed to dry. A single, 12-24 hr
boll weevil larva was placed on the diet, and the wells were sealed
and held at 25°C, 50% RH for 5 days. Mortality and larval weights
were then assessed. Control mortality ranged between 0-13%.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 µl of aqueous solution (10-fold concentrated
Photorhabdus culture broth(s), control medium (2% Proteose Peptone
#3), 10 mM sodium phosphate buffer, toxin complex(es) @ 0.23 mg/ml
or H2O) and approximately 20, 1-day old larvae (Aedes aegypti).
There were 6 wells per treatment. The results were read at 3-4
days after infestation. Control mortality was between 0-20%.

Activity against fruitflies was tested as follows. Purchased
Drosophila melanogaster medium was prepared using 50% dry medium
and a 50% liquid of either water, control medium (2% Proteose
Peptone #3), 10-fold concentrated Photorhabdus culture broth(s),
purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate
buffer, pH 7.0. This was accomplished by placing 4.0 ml of dry
medium in each of 3 rearing vials per treatment and adding 4.0 ml
of the appropriate liquid. Ten late instar Drosophila melanogaster
maggots were then added to each 25 ml vial. The vials were held on
a laboratory bench, at room temperature, under fluorescent ceiling
lights. Pupal or adult counts were made after 15 days of exposure.
Adult emergence as compared to water and control medium (0-16%
reduction).

Activity against aster leafhopper adults (Macrosteles
severini) and corn planthopper nymphs (Peregrinus maidis) was
tested with an ingestion assay designed to allow ingestion of the
active without other external contact. The reservoir for the
active/"food" solution is made by making 2 holes in the center of
the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm
square is placed across the top of the dish and secured with an "O"
ring. A 1 oz. plastic cup is then infested with approximately 7
hoppers and the reservoir is placed on top of the cup, Parafilm
down. The test solution is then added to the reservoir through the
holes. In tests using 10-fold concentrated Photorhabdus culture
broth(s), the broth and control medium (2% Proteose Peptone #3)
were dialyzed against 10 mM sodium phosphate buffer, pH 7.0, and
sucrose (to 5%) was added to the resulting solution to reduce
control mortality. Purified toxin complex(es) [0.23 mg/ml] or 10
mM sodium phosphate buffer, pH 7.0, was also tested. Mortality is
reported at day 3. The assay was held in an incubator at 28°C, 70%
RH with a 16/8 photoperiod. The assays were graded for mortality
at 72 hours. Control mortality was less than 6%.

Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control
medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23
mg/ml] or 10 mM sodium phosphate buffer, pH 7.0, were applied
directly to the surface (about 1.5 cm²) of standard artificial
lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots. The
diet plates were allowed to air-dry in a sterile flow-hood and each
well was infested with a single, neonate larva. European corn borer
(Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were
obtained from commercial sources and hatched in-house, whereas
tobacco budworm (Heliothis virescens) larvae were supplied
internally. Following infestation with larvae, the diet plates
were sealed, placed in a humidified growth chamber and maintained
in the dark at 27°C for the appropriate period. Mortality and
weight determinations were scored at day 5. Generally, 16 insects
per treatment were used in all studies. Control mortality
generally ranged from about 4 to about 12.5% for control medium and
was less than 10% for phosphate buffer.

Activity against two-spotted spider mite (Tetranychus urticae)
was determined as follows. Young squash plants were trimmed to a
single cotyledon and sprayed to run-off with 10-fold concentrated
broth(s), control medium (2% Proteose Peptone #3), purified toxin
complex(es) or 10 mM sodium phosphate buffer, pH 7.0. After drying,
the plants were infested with a mixed population of spider mites
and held at lab temperature and humidity for 72 hr. Live mites
were then counted to determine levels of control.

Table 20

Observed Insecticidal Spectrum of Broths from
Different Photorhabdus Strains

Photorhabdus strain      Sensitive* insect species




W A - X 


3** 




4, 5 




6, 


7, 8 




M .A. 
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4 
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WY - A 
W A 
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i n 

1 u 
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WAD 


4 
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WY - 7 
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4, 
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7, 
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1 5 


wy -in 
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4, 
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2, 
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6, 7, 8, 9 
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5, 
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7, 
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UK 
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1, 
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H9 
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3, 
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5, 


6, 7, 8 




W-14                     1, 2, 3, 4, 5, 6, 7, 8, 10
ATCC 43948               4
ATCC 43949               4
ATCC 43950               4
ATCC 43951               4
ATCC 43952               4

* = > 25% mortality and/or growth inhibition vs. control
** = 1; Tobacco budworm, 2; European corn borer, 3;
Tobacco hornworm, 4; Southern corn rootworm, 5;
Boll weevil, 6; Mosquito, 7; Fruit fly, 8;
Aster leafhopper, 9; Corn planthopper, 10;
Two-spotted spider mite.
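
The sensitivity criterion in the Table 20 footnote (more than 25% mortality and/or growth inhibition versus control) can be expressed as a small scoring function. The Python sketch below is one possible reading, not the scoring procedure actually used: it assumes "vs. control" means treated mortality minus control mortality in percentage points, and growth inhibition as the percent reduction in mean larval weight relative to control. All names and example counts are hypothetical.

```python
def percent(part, whole):
    """Percentage that `part` represents of `whole`."""
    return 100.0 * part / whole


def is_sensitive(treated_dead, treated_n, control_dead, control_n,
                 treated_weight=None, control_weight=None, threshold=25.0):
    """Apply the >25% mortality and/or growth inhibition criterion."""
    # Assumption: mortality is compared as a difference in percentage points.
    excess = percent(treated_dead, treated_n) - percent(control_dead, control_n)
    if excess > threshold:
        return True
    if treated_weight is not None and control_weight:
        # Assumption: inhibition is the relative reduction in mean weight.
        inhibition = 100.0 * (1.0 - treated_weight / control_weight)
        return inhibition > threshold
    return False


# Hypothetical wells of 16 insects each, as generally used in the assays.
print(is_sensitive(8, 16, 0, 16))              # 50% mortality over control
print(is_sensitive(3, 16, 0, 16, 4.0, 10.0))   # low mortality, 60% inhibition
```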
Example 14
Non W-14 Photorhabdus Strains:
Purification, Characterization and Activity Spectrum

Purification

The protocol, as follows, is similar to that developed for the
purification of W-14 and was established based on purifying those
fractions having the most activity against Southern corn rootworm
(SCR), as determined in bioassays (see Example 13). Typically, 4-20 L
of broth that had been filtered, as described in Example 13,
were received and concentrated using an Amicon spiral ultra
filtration cartridge Type S1Y100 attached to an Amicon M-12
filtration device. The retentate contained native proteins
consisting of molecular sizes greater than 100 kDa, whereas the
flow through material contained native proteins less than 100 kDa
in size. The majority of the activity against SCR was contained in
the 100 kDa retentate. The retentate was then continually
diafiltered with 10 mM sodium phosphate (pH = 7.0) until the
filtrate reached an A280 < 0.100. Unless otherwise stated, all
procedures from this point were performed in buffer as defined by
10 mM sodium phosphate (pH 7.0). The retentate was then
concentrated to a final volume of approximately 0.20 L and filtered
using a 0.45 µm Nalgene™ Filterware sterile filtration unit. The
filtered material was loaded at 7.5 ml/min onto a Pharmacia HR16/10
column which had been packed with PerSeptive Biosystem Poros® 50 HQ
strong anion exchange matrix equilibrated in buffer using a
PerSeptive Biosystem Sprint® HPLC system. After loading, the
column was washed with buffer until an A280 < 0.100 was achieved.
Proteins were then eluted from the column at 2.5 ml/min using
buffer with 0.4 M NaCl for 20 min for a total volume of 50 ml. The
column was then washed using buffer with 1.0 M NaCl at the same
flow rate for an additional 20 min (final volume = 50 ml).
Proteins eluted with 0.4 M and 1.0 M NaCl were placed in separate
dialysis bags (Spectra/Por® Membrane MWCO: 2,000) and allowed to
dialyze overnight at 4°C in 12 L buffer. The majority of the
activity against SCR was contained in the 0.4 M fraction. The 0.4
M fraction was further purified by application of 20 ml to a
Pharmacia XK 26/100 column that had been prepacked with Sepharose
CL4B (Pharmacia) using a flow rate of 0.75 ml/min. Fractions were
volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal
filter device Biomax-50K NMWL membrane. Protein concentrations
were determined using a Biorad Protein Assay Kit with bovine gamma
globulin as a standard.

Characterization

The native molecular weight of the SCR toxin complex was
determined using a Pharmacia HR 16/50 column that had been prepacked with
Sepharose CL4B in buffer. The column was then calibrated using
proteins of known molecular size, thereby allowing for calculation
of the toxin's approximate native molecular size. As shown in Table
21, the molecular size of the toxin complex ranged from 777 kDa
with strain Hb to 1,900 kDa with strain WX-14. The yield of toxin
complex also varied, from strain WX-12 producing 0.8 mg/L to strain
Hb, which produced 7.0 mg/L.

Proteins found in the toxin complex were examined for
individual polypeptide size using SDS-PAGE analysis. Typically, 20
µg protein of the toxin complex from each strain was loaded onto a
2-15% polyacrylamide gel (Integrated Separation Systems) and
electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After
completion of electrophoresis, the gels were stained overnight in
Biorad Coomassie blue R-250 (0.2% in methanol:acetic acid:water;
40:10:40 v/v/v). Subsequently, gels were destained in
methanol:acetic acid:water; 40:10:40 (v/v/v). The gels were then
rinsed with water for 15 min and scanned using a Molecular Dynamics
Personal Laser Densitometer®. Lanes were quantitated and molecular
sizes were calculated as compared to Biorad high molecular weight
standards, which ranged from 200-45 kDa.

Sizes of the individual polypeptides comprising the SCR toxin
complex from each strain are listed in Table 22. The sizes of the
individual polypeptides ranged from 230 kDa with strain WX-1 to a
size of 16 kDa, as seen with strain WX-7. Every strain, with the
exception of strain Hb, had polypeptides comprising the toxin
complex that were in the 160-230 kDa range, the 100-160 kDa range,
and the 50-80 kDa range. These data indicate that the toxin
complex may vary in peptide composition and components from strain
to strain; however, in all cases the toxin appears to consist of a
large, oligomeric protein complex.
Table 21

Characterization of a Toxin Complex from
Non W-14 Photorhabdus Strains

Strain     Approx. Native        Yield Active
           Molecular Wt. a       Fraction (mg/L) b

H9           972,000             1.8
Hb           777,000             7.0
Hm         1,400,000             1.1
HP88         813,000             2.5
NCI        1,092,000             3.3
WIR          979,000             1.0
WX-1         973,000             0.8
WX-2         951,000             2.2
WX-7       1,000,000             1.5
WX-12        898,000             0.4
WX-14      1,900,000             1.9
W-14         860,000             7.5

a Native molecular weight determined using a Pharmacia HR
  16/50 column packed with Sepharose CL4B
b Amount of toxin complex recovered from culture broth.



Activity Spectrum 

As shown in Table 23, the toxin complexes purified from
strains Hm and H9 were tested for activity against a variety of
insects, with the toxin complex from strain W-14 for comparison.
The assays were performed as described in Example 13. The toxin
complex from all three strains exhibited activity against tobacco
bud worm, European corn borer, Southern corn root worm, and aster
leaf hopper. Furthermore, the toxin complex from strains Hm and
W-14 also exhibited activity against two-spotted spider mite. In
addition, the toxin complex from W-14 exhibited activity against
mosquito larvae. These data indicate that the toxin complex, while
having similarities in activities between certain orders of
insects, can also exhibit differential activities against other
orders of insects.
Table 23

Observed Insecticidal Spectrum of a Purified Toxin Complex from
Photorhabdus Strains

Photorhabdus Strain      Sensitive* Insect Species

Hm Toxin Complex         1**, 2, 3, 5, 6, 7, 8
H9 Toxin Complex         1, 2, 3, 6, 7, 8
W-14 Toxin Complex       1, 2, 3, 4, 5, 6, 7, 8

*  = > 25% mortality or growth inhibition
** = 1, Tobacco bud worm; 2, European corn borer; 3, Southern
     corn root worm; 4, Mosquito; 5, Two-spotted spider mite;
     6, Aster Leaf hopper; 7, Fruit Fly; 8, Boll Weevil



Example 15 

Sub-Fractionation of Photorhabdus Protein Toxin Complex 



The Photorhabdus protein toxin complex was isolated as
described in Example 14. Next, about 10 mg toxin was applied to a
MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow
rate of 1 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0
until the optical density at 280 nm returned to baseline
absorbance. The proteins bound to the column were eluted with a
linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1
ml/min for 30 min. One ml fractions were collected and subjected
to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks
of activity were determined by a series of dilutions of each
fraction in SCR bioassays. Two activity peaks against SCR were
observed and were named A (eluted at about 0.2-0.3 M NaCl) and B
(eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled
separately and both peaks were further purified using a 3-step
procedure described below.
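The relationship between fraction number and salt concentration in the linear gradient above can be sketched as follows. This is an idealized illustration: it assumes the gradient starts with the first collected fraction and ignores the system dead volume and mixing delay, neither of which is described in the text. The function name and its parameters are introduced here for illustration only.

```python
def nacl_range_for_fraction(fraction_number, fraction_ml=1.0, flow_ml_per_min=1.0,
                            gradient_min=30.0, start_molar=0.0, end_molar=1.0):
    """Return the (start, end) NaCl molarity spanned by one collected fraction.

    Idealized: the gradient is assumed to begin with fraction 1, with no
    dead volume between the mixer and the fraction collector.
    """
    if fraction_number < 1:
        raise ValueError("fraction_number must be 1 or greater")
    minutes_per_fraction = fraction_ml / flow_ml_per_min
    slope = (end_molar - start_molar) / gradient_min  # molar per minute

    def concentration(t_min):
        # Hold the concentration constant outside the gradient window.
        t_min = min(max(t_min, 0.0), gradient_min)
        return start_molar + slope * t_min

    t_start = (fraction_number - 1) * minutes_per_fraction
    t_end = fraction_number * minutes_per_fraction
    return concentration(t_start), concentration(t_end)


# Fractions that would overlap the 0.2-0.3 M window reported for peak A,
# under the idealized assumptions stated above.
for n in (7, 8, 9):
    low, high = nacl_range_for_fraction(n)
    print(f"fraction {n}: {low:.2f}-{high:.2f} M NaCl")
```

On a real system the observed elution position would be shifted later by the delay volume, so the fraction numbers printed here are only a rough guide.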

Solid (NH4)2SO4 was added to the above protein fraction to a
final concentration of 1.7 M. Proteins were then applied to a
phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in 50
mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins bound to
the column were eluted with a linear gradient of 1.7 M (NH4)2SO4,
0% ethylene glycol, 50 mM potassium phosphate, pH 7.0 to 25%
ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no (NH4)2SO4)
at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM
sodium phosphate buffer, pH 7.0. Activities in each fraction
against SCR were determined by bioassay.






The fractions with the highest activity were pooled and
applied to a MonoQ 5/5 column which was equilibrated with 20 mM
Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column
were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20
mM Tris-HCl, pH 7.0.

For the final step of purification, the most active fractions
above (determined by SCR bioassay) were pooled and subjected to a
second phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a
final concentration of 1.7 M. The solution was then loaded onto
the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium
phosphate buffer, pH 7 at 1 ml/min. Proteins bound to the column
were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM
potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0
at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM
sodium phosphate buffer, pH 7.0. Activities in each fraction
against SCR were determined by bioassay.

The final purified protein obtained from peak A by the above
3-step procedure was named toxin A, and the final purified protein
from peak B was named toxin B.

Characterization and Amino Acid Sequencing of Toxin A and Toxin B

In SDS-PAGE, both toxin A and toxin B contained two major
(> 90% of total Coomassie stained protein) peptides: 192 kDa (named
A1 and B1, respectively) and 58 kDa (named A2 and B2,
respectively). Both toxin A and toxin B revealed only one major
band in native PAGE, indicating A1 and A2 were subunits of one
protein complex, and B1 and B2 were subunits of one protein
complex. Further, the native molecular weights of both toxin A and
toxin B were determined to be 860 kDa by gel filtration
chromatography. The relative molar concentrations of A1 and A2 were
judged to be equivalent (1 to 1) as determined by densitometric
analysis of SDS-PAGE gels. Similarly, B1 and B2 peptides were
present at the same molar concentration.
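The sizes reported above allow a rough, back-of-the-envelope estimate of how many 192 kDa + 58 kDa subunit pairs could fit into the 860 kDa native complex. The text does not report a copy number; the calculation below is only an illustrative sketch, and both SDS-PAGE and gel filtration sizes are approximate, so the result should be read loosely.

```python
def subunit_pairs_in_complex(native_kda, subunit_kdas):
    """Estimate how many copies of a subunit set fit in a native complex.

    Assumes the subunits are present at equimolar amounts, as the
    densitometry described in the text suggests.
    """
    pair_mass = sum(subunit_kdas)
    if pair_mass <= 0:
        raise ValueError("subunit masses must sum to a positive value")
    return native_kda / pair_mass


# 860 kDa native size; 192 kDa and 58 kDa subunits at a 1:1 molar ratio.
copies = subunit_pairs_in_complex(860.0, [192.0, 58.0])
print(f"approximately {copies:.2f} subunit pairs per complex")
```

The ratio falls between three and four, which is consistent with an oligomeric complex but does not by itself distinguish between the two.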

Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and
transblotted to PVDF membranes. Blots were sent for amino acid
analysis and N-terminal amino acid sequencing at Harvard MicroChem
and Cambridge ProChem, respectively. The N-terminal amino acid
sequence of B1 was determined to be identical to SEQ ID NO:1, the
TcbAii region of the tcbA gene (SEQ ID NO:12, position 87 to 99). A
unique N-terminal sequence was obtained for peptide B2 (SEQ ID
NO:40). The N-terminal amino acid sequence of peptide B2 was
identical to the TcbAiii region of the derived amino acid sequence
for the tcbA gene (SEQ ID NO:12, position 1935 to 1945).
Therefore, the B toxin contained predominantly two peptides, TcbAii
and TcbAiii, that were observed to be derived from the same gene
product, TcbA.

The N-terminal sequence of A2 (SEQ ID NO:41) was unique in
comparison to the TcbAiii peptide and other peptides. The A2
peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was
determined to be a mixture of amino acid sequences SEQ ID NO:40 and
41.

Peptides A1 and A2 were further subjected to internal amino
acid sequencing. For internal amino acid sequencing, 10 µg of
toxin A was electrophoresed in 10% SDS-PAGE and transblotted to
PVDF membrane. After the blot was stained with amido black,
peptides A1 and A2, denoted TcdAii and TcdAiii, respectively, were
excised from the blot and sent to Harvard MicroChem and Cambridge
ProChem. Peptides were subjected to trypsin digestion followed by
HPLC chromatography to separate individual peptides. N-terminal
amino acid analysis was performed on selected tryptic peptide
fragments. Two internal amino acid sequences of peptide A1
(TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were
found to have significant homologies with deduced amino acid
sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12).
Similarly, the N-terminal sequence (SEQ ID NO:41) and two internal
sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and
TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with
deduced amino acid sequences of the TcbAiii region of the tcbA gene
(SEQ ID NO:12).

In summary, the toxin complex has at least two active protein
toxin complexes against SCR: toxin A and toxin B. Toxin A and
toxin B are similar in their native and subunit molecular weights;
however, their peptide compositions are different. Toxin A
contains peptides TcdAii and TcdAiii as the major peptides, and
toxin B contains TcbAii and TcbAiii as the major peptides.

Purification and Characterization of Toxin C, Tca Peptides

The Photorhabdus protein toxin complex was isolated as
described above. Next, about 50 mg toxin was applied to a MonoQ
10/10 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow
rate of 2 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0
until the optical density at 280 nm returned to baseline level.
The proteins bound to the column were eluted with a linear gradient
of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 at 2 ml/min for 60 min.
Fractions of 2 ml were collected and subjected to Western analysis
using pAb TcaBii-syn antibody (see Example 21) as the primary
antibody. Fractions that reacted with pAb TcaBii-syn antibody were
combined and solid (NH4)2SO4 was added to a final concentration of
1.7 M. Proteins were then applied to a phenyl-Superose 10/10
column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium
phosphate buffer, pH 7 at 1 ml/min. Proteins bound to the column
were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM
potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0
at 1 ml/min for 120 min. Fractions of 2 ml were collected, dialyzed
overnight against 10 mM sodium phosphate buffer, pH 7.0, and
analyzed by Western blots using pAb TcaBii-syn antibody as the
primary antibody.

Fractions that cross-reacted with the antibody were pooled and
applied to a MonoQ 5/5 column which was equilibrated with 20 mM
Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were
eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM
Tris-HCl, pH 7.0 for 30 min.

Fractions above that reacted with pAb TcaBii-syn antibody were
pooled and subjected to a phenyl-Superose 5/5 column. Solid
(NH4)2SO4 was added to a final concentration of 1.7 M. The solution
was then applied onto the column equilibrated with 1.7 M (NH4)2SO4
in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins
bound to the column were then eluted with a linear gradient of 1.7
M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium
phosphate, pH 7.0 at 0.5 ml/min for 60 min. Fractions were
dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0.

For the final purification step, fractions that reacted with pAb
TcaBii-syn antibody, as determined by Western analysis, were
combined and applied to a MonoQ 5/5 column equilibrated with 20 mM
Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were
eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM
Tris-HCl, pH 7.0 for 30 min.

The final purified protein fraction contained 6 major peptides
as examined by SDS-PAGE: 165 kDa, 90 kDa, 64 kDa, 62 kDa, 58 kDa,
and 22 kDa. The LD50 values of the insecticidal activities of this
purified fraction were determined to be 100 ng and [illegible] ng
against SCR and ECB, respectively.

The above peptides were blotted to PVDF membranes and blots
were sent for amino acid analysis and N-terminal sequencing of
five amino acids at Harvard MicroChem and Cambridge ProChem,
respectively. The N-terminal amino acid sequence of the 165 kDa
peptide was determined to be identical to peptide TcaC (SEQ ID
NO:2, position 1 to 5). The N-terminal amino acid sequence of the
90 kDa peptide was determined to be the TcaAii region of the
derived amino acid sequence for the tcaA gene (SEQ ID NO:33,
position 254 to 258). The N-terminal amino acid sequence of the 64
kDa peptide was determined to be identical to peptide TcaBi (SEQ ID
NO:3, position 1 to 5). The N-terminal amino acid sequence of the
62 kDa peptide was determined to be the TcaAii region of the
derived amino acid sequence for the tcaA gene (SEQ ID NO:33,
position 489 to 493). The N-terminal amino acid sequence of the 58
kDa peptide was determined to be identical to peptide TcaBii (SEQ
ID NO:5, position 1 to 5). The N-terminal amino acid sequence of
the 22 kDa peptide (SEQ ID NO:62) was determined to be the TcaAi
region, denoted TcaAiv, of the derived amino acid sequence for the
tcaA gene (SEQ ID NO:34, position 98 to 102). It is noted that the
tcaA, tcaB, and tcaC genes all reside in the same tca operon
(Fig. 6A).

Five µg of purified Tca fraction, purified toxin A, and
purified toxin B were analyzed by Western blot using the following
antibodies individually as primary antibody: pAb TcaBii-syn
antibody, mAb CF52 antibody, pAb TcdAii-syn antibody, and pAb
Tcdiii-syn antibody (Example 21). With pAb TcaBii-syn antibody,
only the purified Tca peptides fraction reacted, but not toxin A or
toxin B. With mAb CF52 antibody, only toxin B reacted, but not the
Tca peptides fraction or toxin A. With either pAb TcdAii-syn
antibody or pAb Tcdiii-syn antibody, only toxin A reacted, but not
the Tca peptides fraction or toxin B. This indicated that the
insecticidal activity observed in the purified Tca peptides
fraction is independent of toxin A and toxin B. The purified Tca
peptide fraction is a third unique protein toxin, denoted toxin C.
Example 16

Cleavage and Activation of TcbA Peptide 

In the toxin B complex, peptides TcbAii and TcbAiii originate
from the single gene product TcbA (Example 15). The processing of
TcbA peptide to TcbAii and TcbAiii is presumably by the action of
Photorhabdus protease(s), and most likely, the metalloproteases
described in Example 10. In some cases, it was noted that when
Photorhabdus W-14 broth was processed, TcbA peptide was present in
toxin B complex as a major component, in addition to peptides
TcbAii and TcbAiii. Identical procedures, described for the
purification of toxin B complex (Example 15), were used to enrich
peptide TcbA from the toxin complex fraction of W-14 broth. The
final purified material was analyzed in a 4-20% gradient SDS-PAGE
and major peptides were quantified by densitometry. It was
determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and
6%, respectively, of total protein. The identities of these
peptides were confirmed by their respective molecular sizes in
SDS-PAGE and Western blot analysis using monospecific antibodies.
The native molecular weight of this fraction was determined to be
860 kDa.

The cleavage of TcbA was evaluated by treating the above
purified material with purified 38 kDa and 58 kDa W-14 Photorhabdus
metalloproteases (Example 10), and trypsin as a control enzyme
(Sigma, MO). The standard reaction consisted of 17.5 µg of the
above purified fraction, 1.5 unit protease, and 0.1 M Tris buffer,
pH 8.0 in a total volume of 100 µl. For the control reaction,
protease was omitted. The reaction mixtures were incubated at 37°C
for 90 min. At the end of the reaction, 20 µl was taken and boiled
with SDS-PAGE sample buffer immediately for electrophoresis
analysis in a 4-20% gradient SDS-PAGE. It was determined from
SDS-PAGE that in both 38 kDa and 58 kDa protease treatments, the
amount of peptides TcbAii and TcbAiii increased about 3-fold while
the amount of TcbA peptide decreased proportionally (Table 24). The
relative reduction and augmentation of selected peptides was
confirmed by Western blot analyses. Furthermore, gel filtration of
the cleaved material revealed that the native molecular size of the
complex remained the same. Upon trypsin treatment, peptides TcbA
and TcbAii were nonspecifically digested into small peptides. This
indicated that the 38 kDa and 58 kDa Photorhabdus proteases can
specifically process peptide TcbA into peptides TcbAii and TcbAiii.
The remaining 80 µl of the protease-treated and untreated control
reaction mixtures were serially diluted with 10 mM sodium phosphate
buffer, pH 7.0 and analyzed by SCR bioassay. By comparing activity
in several dilutions, it was determined that the 38 kDa protease
treatment increased SCR insecticidal activity approximately 3 to 4
fold. The growth inhibition of remaining insects in the protease
treatment was also more severe than in the control (Table 24).
Table 24

Conversion and Activation of Peptide TcbA into Peptides TcbAii and
TcbAiii by Protease Treatment

                               Control    38 kDa protease treatment

TcbA (% of total protein)      58         19
TcbAii (% of total protein)    36         64
TcbAiii (% of total protein)   6          18
LD50 (µg protein)              2.1        0.52
SCR Weight (mg/insect)*        0.2        0.1

* Weight of live insect after 5 days on diet in the assay.

Activation and Processing of Toxin B by SCR Gut Proteases

In a second demonstration of proteolytic activation, it
was examined whether W-14 toxins are processed by insects. Toxin B
purified from Photorhabdus W-14 broth (see Example 15) was
comprised of predominantly intact TcbA peptides as judged by
SDS-PAGE and Western blot analysis using monoclonal antibody. The
LD50 of this fraction against SCR was determined to be around
700 ng.

SCR larvae were grown on coleopteran diet until they reached
the fourth instar stage (about 100-125 mg total weight each
insect). SCR gut content was collected as follows: the guts were
removed using dissecting scissors and forceps. After removing the
excess fatty material that coats the gut lining, about 40 guts were
homogenized in a microcentrifuge tube containing 100 µl sterile
water. The tube was then centrifuged at 14,000 rpm for 10 minutes
and the pellet discarded. The supernatant was stored in a -70°C
freezer until use.

The processing of toxin B by insect gut was evaluated by
treating the above purified toxin B with the SCR gut content
collected. The reaction consisted of 40 µg toxin B (1 mg/ml), 50 µl
SCR gut content, and 0.1 M Tris buffer, pH 8.0 in a total volume of
100 µl. For the control reaction, SCR gut content was omitted.
The reaction mixtures were incubated at 37°C overnight. At the
end of the reaction, 10 µl was withdrawn and boiled with an equal
volume of 2x SDS-PAGE sample buffer for SDS-PAGE analysis. The
remaining 90 µl reaction mixture was serially diluted with 10 mM
sodium phosphate buffer, pH 7.0 and analyzed by SCR bioassay.
SDS-PAGE analysis indicated that in the SCR gut content treatment,
peptide TcbA was digested completely into smaller peptides.
Analysis of the undenatured toxin fraction showed that the native
size, about 860 kDa, remained the same even though larger peptides
were fragmented. In SCR bioassays, the LD50 of SCR gut-treated
toxin B was found to be about 70 ng, representing a 10-fold
increase in activity. In a separate experiment, protease K
treatment completely eliminated toxin activity.
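The activation figures quoted in this example follow from the ratio of LD50 values before and after treatment, since a lower LD50 means higher potency. The short sketch below simply reproduces that arithmetic; the function name is introduced here for illustration, and the inputs are the values of about 700 ng and about 70 ng given in the text, plus the control and protease-treated values of 2.1 and 0.52 listed in Table 24.

```python
def fold_activation(ld50_before, ld50_after):
    """Return the fold increase in potency implied by a drop in LD50.

    Both values must be in the same units; a result above 1 means the
    treated material is more potent than the untreated material.
    """
    if ld50_before <= 0 or ld50_after <= 0:
        raise ValueError("LD50 values must be positive")
    return ld50_before / ld50_after


# SCR gut treatment of toxin B: about 700 ng before, about 70 ng after.
print(f"gut content treatment: {fold_activation(700.0, 70.0):.1f}-fold")

# 38 kDa protease treatment, using the two values listed in Table 24.
print(f"38 kDa protease treatment: {fold_activation(2.1, 0.52):.1f}-fold")
```

The second ratio lands at roughly four, in line with the "approximately 3 to 4 fold" increase stated for the 38 kDa protease treatment.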

Example 17 

Screening of the Library for a Gene Encoding the TcdAii Peptide

The cloning and characterization of a gene encoding the TcdAii
peptide, described as SEQ ID NO:17 (internal peptide TcdAii-PT111
N-terminal sequence) and SEQ ID NO:18 (internal peptide TcdAii-PT79
N-terminal sequence), was completed. Two pools of degenerate
oligonucleotides, designed to encode the amino acid sequences of
SEQ ID NO:17 (Table 25) and SEQ ID NO:18 (Table 26), and the
reverse complements of those sequences, were synthesized as
described in Example 8. The DNA sequence of the oligonucleotides
is given below:
Polymerase Chain Reactions (PCR) were performed essentially as
described in Example 8, using as forward primers P2.3.6.CB or
P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all
forward/reverse combinations, using Photorhabdus W-14 genomic DNA
as template. In another set of reactions, primers P2.79.2 or
P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5RI, and
P2.3R.CB were used as reverse primers in all forward/reverse
combinations. Only in the reactions containing P2.3.6.CB as the
forward primer combined with P2.79.R.1 or P2.79R.CB as the reverse
primers was a non-artifactual amplified product seen, of estimated
size (mobility on agarose gels) of 2500 base pairs. The order of
the primers used to obtain this amplification product indicates
that the peptide fragment TcdAii-PT111 lies amino-proximal to the
peptide fragment TcdAii-PT79.
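A pool of degenerate oligonucleotides of the kind used in this example must cover every codon combination that could encode the target peptide. The sketch below counts how many distinct coding sequences a fully degenerate pool would need, using the standard genetic code. The peptide shown is a hypothetical placeholder and is not SEQ ID NO:17 or SEQ ID NO:18; actual primer design, including any codon-bias choices or base analogs, is not described in this section.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def pool_degeneracy(peptide):
    """Count the distinct DNA sequences that can encode a peptide."""
    try:
        return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unknown amino acid code: {exc.args[0]}") from None


# Hypothetical six-residue peptide chosen only for illustration.
example_peptide = "MWKDEF"
print(example_peptide, pool_degeneracy(example_peptide))
```

Stretches rich in methionine, tryptophan, and two-codon residues keep the pool small, which is why such regions are generally preferred as primer sites.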

The 2500 bp PCR products were ligated to the plasmid vector
pCR™II (Invitrogen, San Diego, CA) according to the supplier's
instructions, and the DNA sequences across the ends of the insert
fragments of two isolates (HS24 and HS27) were determined using the
supplier's recommended primers and the sequencing methods described
previously. The sequence of both isolates was the same. New
primers were synthesized based on the determined sequence, and used
to prime additional sequencing reactions to obtain a total of 2557
bases of the insert [SEQ ID NO:36]. Translation of the partial
peptide encoded by SEQ ID NO:36 yields the 845 amino acid sequence
disclosed as SEQ ID NO:37. Protein homology analysis of this
portion of the TcdAii peptide fragment reveals substantial amino
acid homology to protein TcbA [SEQ ID NO:12]: 68% similarity and
53% identity to residues 542 to 1390 using the Wisconsin Package
Version 8.0, Genetics Computer Group (GCG), Madison, WI, or 60%
similarity and 54% identity to residues 567 to 1389 using the
Wisconsin Package Version 9.0. It is therefore apparent that the
gene represented in part by SEQ ID NO:36 produces a protein of
similar, but not identical, amino acid sequence to the TcbA
protein, and which likely has similar, but not identical,
biological activity to the TcbA protein.

In yet another instance, a gene encoding the peptides
TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described as
SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence) and SEQ ID
NO:41 (TcdAiii 58 kDa N-terminal peptide sequence), was isolated.
Two pools of degenerate oligonucleotides, designed to encode the
amino acid sequences described as SEQ ID NO:39 (Table 28) and SEQ
ID NO:41 (Table 27), and the reverse complements of those
sequences, were synthesized as described in Example 8, and their
DNA sequences.
Polymerase Chain Reactions (PCR) were performed essentially as
described in Example 8, using as forward primers A1.44.1 or
A1.44.2, and reverse primers A2.3R or A2.4R, in all forward/reverse
combinations, using Photorhabdus W-14 genomic DNA as template. In
another set of reactions, primers A2.1 or A2.2 were used as forward
primers, and A1.44.1R and A1.44.2R were used as reverse primers in
all forward/reverse combinations. Only in the reactions containing
A1.44.1 or A1.44.2 as the forward primers combined with A2.3R as
the reverse primer was a non-artifactual amplified product seen, of
estimated size (mobility on agarose gels) of 1400 base pairs. The
order of the primers used to obtain this amplification product
indicates that the peptide fragment TcdAii-PK44 lies amino-proximal
to the 58 kDa peptide fragment of TcdAiii.

The 1400 bp PCR products were ligated to the plasmid vector
pCR™II according to the supplier's instructions. The DNA sequences
across the ends of the insert fragments of four isolates were
determined using primers similar in sequence to the supplier's
recommended primers and using sequencing methods described
previously. The nucleic acid sequence of all isolates differed as
expected in the regions corresponding to the degenerate primer
sequences, but the amino acid sequences deduced from these data
were the same as the actual amino acid sequences for the peptides
determined previously (SEQ ID NOS:41 and 39).

Screening of the W-14 genomic cosmid library as described in 
Example 8 with a radiolabeled probe comprised of the DNA prepared 
above (SEQ ID NO: 36) identified five hybridizing cosmid isolates, 
namely 17D9, 20B10, 21D2, 27B10, and 26D1. These cosmids were
distinct from those previously identified with probes corresponding 
to the genes described as SEQ ID NO: 11 or SEQ ID NO: 25. 
Restriction enzyme analysis and DNA blot hybridizations identified 
three EcoR I fragments, of approximate sizes 3.7, 3.7, and 1.1 kbp, 
that span the region comprising the DNA of SEQ ID NO:36. Screening 
of the W-14 genomic cosmid library using as probe the radiolabeled 
1.4 kbp DNA fragment prepared in this example identified the same 
five cosmids (17D9, 20B10, 21D2, 27B10, and 26D1). DNA blot
hybridization to EcoR I-digested cosmid DNAs also showed 
hybridization to the same subset of EcoR I fragments as seen with 
the 2.5 kbp TcdAii gene probe, indicating that both fragments are 
encoded on the genomic DNA. 
DNA sequence determination of the cloned EcoR I fragments
revealed an uninterrupted reading frame of 7551 base pairs (SEQ ID
NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ ID
NO:47). Analysis of the amino acid sequence of this protein
revealed all expected internal fragments of peptides TcdAii (SEQ ID
NOS:17, 18, 37, 38 and 39) and the TcdAiii peptide N-terminus (SEQ
ID NO:41) and all TcdAiii internal peptides (SEQ ID NOS:42 and 43).
The peptides isolated and identified as TcdAii and TcdAiii are each
products of the open reading frame, denoted tcdA, disclosed as SEQ
ID NO:46. Further, SEQ ID NO:47 shows, starting at position 89,
the sequence disclosed as SEQ ID NO:13, which is the N-terminal
sequence of a peptide of size approximately 201 kDa, indicating
that the initial protein produced from SEQ ID NO:46 is processed
in a manner similar to that previously disclosed for SEQ ID NO:12.
In addition, the protein is further cleaved to generate a product
of size 209.2 kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID
NO:49 (TcdAii peptide), and a product of size 63.6 kDa, encoded by
SEQ ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus,
it is thought that the insecticidal activity identified as toxin A
(Example 15) derives from the products of SEQ ID NO:46, as
exemplified by the full-length protein of 282.9 kDa disclosed as
SEQ ID NO:47, which is processed to produce the peptides disclosed
as SEQ ID NOS:49 and 51. It is thought that the insecticidal
activity identified as toxin B (Example 15) derives from the
products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein
disclosed as SEQ ID NO:12. This protein is proteolytically
processed to yield the 207.6 kDa peptide disclosed as SEQ ID NO:53,
which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide having
the N-terminal sequence disclosed as SEQ ID NO:40, and further
disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.
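The reading-frame length and protein size quoted for tcdA can be cross-checked with simple arithmetic. The sketch below assumes that the 7551 base pair count includes the terminating stop codon, which the text does not state explicitly but which matches the reported 2516 amino acids. The function names are introduced here for illustration.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Return the number of encoded amino acids for an open reading frame."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def mean_residue_mass_da(protein_kda, residues):
    """Return the average mass per amino acid residue in daltons."""
    return protein_kda * 1000.0 / residues


# Values taken from the text: 7551 bp reading frame, 282.9 kDa protein.
n_residues = residues_from_orf(7551)
print(n_residues, "amino acids")
print(f"{mean_residue_mass_da(282.9, n_residues):.1f} Da per residue")
```

An average in the neighborhood of 110 Da per residue is typical for proteins, so the reported mass and length are mutually consistent.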

Amino acid sequence comparisons between the proteins disclosed
as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have 69%
similarity and 54% identity using the Wisconsin Package Version
8.0, Genetics Computer Group (GCG), Madison, WI or 60% similarity
and 54% identity using version 9.0 of the program. This high
degree of evolutionary relationship is not uniform throughout the
entire amino acid sequence of these peptides, but is higher towards
the carboxy-terminal end of the proteins, since the peptides
disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID
NO:55 (derived from SEQ ID NO:12) have 76% similarity and 64%
identity using the Wisconsin Package Version 8.0, Genetics Computer
Group (GCG), Madison, WI or 71% similarity and 64% identity using
version 9.0 of the program.



Example 18

Control of European Corn Borer-Induced Leaf Damage on Maize Plants
by Spray Application of Photorhabdus (Strain W-14) Broth

The ability of Photorhabdus toxin(s) to reduce plant damage
caused by insect larvae was demonstrated by measuring leaf damage
caused by European corn borer (Ostrinia nubilalis) infested onto
maize plants treated with Photorhabdus broth. Fermentation broth
from Photorhabdus strain W-14 was produced and concentrated
approximately 10-fold using ultrafiltration (10,000 MW pore-size)
as described in Example 13. The resulting concentrated broth was
then filter sterilized using 0.2 micron nitrocellulose membrane
filters. A similarly prepared sample of uninoculated 2% proteose
peptone #3 was used for control purposes. Maize plants (an inbred
line) were grown from seed to vegetative stage 7 or 8 in pots
containing a soilless mixture in a greenhouse (27°C day; 22°C
night, about 50% RH, 14 hr day-length, watered/fertilized as
needed). The test plants were arranged in a randomized complete
block design (3 reps/treatment, 6 plants/treatment) in a greenhouse
with temperature about 22°C day; 18°C night, no artificial light
and with partial shading, about 50% RH and watered/fertilized as
needed. Treatments (uninoculated media and concentrated
Photorhabdus broth) were applied with a syringe sprayer, 2.0 mls
applied from directly (about 6 inches) over the whorl and 2.0
additional mls applied in a circular motion from approximately one
foot above the whorl. In addition, one group of plants received no
treatment. After the treatments had dried (approximately 30
minutes), twelve neonate European corn borer larvae (eggs obtained
from commercial sources and hatched in-house) were applied directly
to the whorl. After one week, the plants were scored for damage to
the leaves using a modified Guthrie Scale (Koziel, M. G., Beland,
G. L., Bowman, C., Carozzi, N. B., Crenshaw, R., Crossland, L.,
Dawson, J., Desai, N., Hill, M., Kadwell, S., Launis, K., Lewis,
K., Maddox, D., McPherson, K., Meghji, M. Z., Merlin, E., Rhodes,
R., Warren, G. W., Wright, M. and Evola, S. V. 1993.
Bio/Technology, 11, 194-195) and the scores were compared
statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range
(HSD) Test p<0.1]. The results are shown in Table 29. For
reference, a score of 1 represents no damage, a score of 2
represents fine "window pane" damage on the unfurled leaf with no
pinhole penetration and a score of 5 represents leaf penetration
with elongated lesions and/or mid rib feeding evident on more than
three leaves (lesions < 1 inch). These data indicate that broth or
other protein containing fractions may confer protection against
specific insect pests when delivered in a sprayable formulation or
when the gene or derivative thereof, encoding the protein or part
thereof, is delivered via a transgenic plant or microbe.

Table 29

Effect of Photorhabdus Culture Broth on
European Corn Borer-Induced Leaf Damage on Maize

Treatment               Average Guthrie Score

No Treatment            5.02 a
Uninoculated medium     5.15 a
Photorhabdus Broth      2.24 b

Means with different letters are statistically different
(p<0.05 or p<0.1).
Example 19 

Genetic Engineering of Genes for Expression in E. coli

Summary of Constructions

A series of plasmids was constructed to express the tcbA gene
of Photorhabdus W-14 in Escherichia coli. A list of the plasmids
is shown in Table 30. A brief description of each construction
follows, as well as a summary of the E. coli expression data
obtained.

Table 30

Expression Plasmids for the tcbA Gene

Plasmid     Gene    Vector/selection    Compartment
pDAB2025    tcbA    pBC/Chl             intracellular
pDAB2026    tcbA    pAcGP67B/Amp        baculovirus, secreted
pDAB2027    tcbA    pET27b/Kan          periplasm
pDAB2028    tcbA    pET15-tcbA          intracellular



Construction of pDAB2025

In Example 9, a large EcoR I fragment which hybridizes to the
TcbAii probe is described. This fragment was subcloned into pBC
(Stratagene, La Jolla, CA) to create pDAB2025. Sequence analysis
indicates that the fragment is 8816 base pairs. The fragment
encodes the tcbA gene with the initiating ATG at position 571 and
the terminating TAA at position 8086. The fragment therefore
carries 570 base pairs of Photorhabdus DNA upstream of the ATG and
730 base pairs downstream of the TAA.
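The coordinates quoted for the pDAB2025 insert can be checked with simple arithmetic. The following is a minimal sketch, not part of the original disclosure; it assumes that positions are 1-based and that position 8086 is the first base of the terminating TAA (the function and variable names are illustrative only).

```python
# Coordinates as stated in the text (1-based positions on the EcoR I fragment).
FRAGMENT_LENGTH = 8816
ATG_START = 571    # first base of the initiating ATG
TAA_START = 8086   # ASSUMPTION: first base of the terminating TAA


def orf_summary(fragment_length, atg_start, stop_start):
    """Return flank lengths and ORF size (stop codon included)."""
    upstream = atg_start - 1
    stop_end = stop_start + 2
    orf_length = stop_end - atg_start + 1
    if orf_length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_length // 3
    return {
        "upstream_bp": upstream,
        "orf_bp": orf_length,
        "codons_with_stop": codons,
        "residues": codons - 1,
        "downstream_bp": fragment_length - stop_end,
    }


summary = orf_summary(FRAGMENT_LENGTH, ATG_START, TAA_START)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under this assumption the upstream count reproduces the stated 570 base pairs and the reading frame is an exact multiple of three. The downstream count comes out slightly below the quoted 730 base pairs, which may mean the quoted figure is rounded or counted from a different base.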

Construction of Plasmid pDAB2026

The tcbA gene was PCR amplified from plasmid pDAB2025 using
the following primers: 5' primer (S1Ac51) 5' TTT AAA CCA TGG GAA
ACT CAT TAT CAA GCA CTA TC 3' and 3' primer (S1Ac31) 5' TTT AAA GCG
GCC GCT TAA CGG ATG GTA TAA CGA ATA TG 3'. PCR was performed using
a TaKaRa LA PCR kit from PanVera (Madison, WI) in the following
reaction: 57.5 microliters water, 10 microliters 10X LA buffer, 16
microliters dNTPs (2.5 mM each stock solution), 20 microliters each
primer at 10 pmoles/microliter, 300 ng of the plasmid pDAB2025
containing the W-14 tcbA gene and one microliter of TaKaRa LA Taq
polymerase. The cycling conditions were 98°C/20 sec, 68°C/5 min,
72°C/10 min for 30 cycles. A PCR product of the expected size, about
7526 bp, was isolated in a 0.8% agarose gel in TBE (100 mM Tris, 90
mM boric acid, 1 mM EDTA) buffer and purified using a Qiaex II kit
from Qiagen (Chatsworth, CA). The purified tcbA gene was digested
with Nco I and Not I and ligated into the baculovirus transfer
vector pAcGP67B (PharMingen, San Diego, CA) and transformed into
DH5α E. coli. The resulting recombinant is called pDAB2026. The
tcbA gene was then cut from pDAB2026 and transferred to pET27b to
create plasmid pDAB2027. A missense mutation in the tcbA gene was
repaired in pDAB2027.

The repaired tcbA gene contains two changes from the sequence
shown in Sequence ID NO: 11: an A>G at 212 changing asparagine 71
to serine 71, and a G>A at 229 changing alanine 77 to threonine
77. These changes are both upstream of the proposed TcbAii
N-terminus.
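The two reported base changes can be checked for internal consistency against the standard genetic code. The sketch below is illustrative only; it assumes that positions 212 and 229 are numbered from the A of the initiating ATG, and it uses a reduced codon table covering just the four residues involved. The actual codons in Sequence ID NO: 11 are not reproduced here.

```python
# Reduced standard-code table: only the codons needed for this check.
CODON_TABLE = {
    "AAT": "Asn", "AAC": "Asn",
    "AGT": "Ser", "AGC": "Ser",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "ACT": "Thr", "ACC": "Thr", "ACA": "Thr", "ACG": "Thr",
}


def locate(nt_position):
    """Map a 1-based coding position to (codon number, offset 0-2 in codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3


def consistent_codons(nt_position, old_base, new_base, old_aa, new_aa):
    """List (original, mutated) codon pairs compatible with a reported change."""
    _, offset = locate(nt_position)
    hits = []
    for codon, residue in CODON_TABLE.items():
        if residue == old_aa and codon[offset] == old_base:
            mutated = codon[:offset] + new_base + codon[offset + 1:]
            if CODON_TABLE.get(mutated) == new_aa:
                hits.append((codon, mutated))
    return hits


print(locate(212), consistent_codons(212, "A", "G", "Asn", "Ser"))
print(locate(229), consistent_codons(229, "G", "A", "Ala", "Thr"))
```

Position 212 falls on the middle base of codon 71 and position 229 on the first base of codon 77, so both reported residue numbers and both amino acid changes are consistent with single-base substitutions.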



Construction of pDAB2028

The tcbA coding region of pDAB2027 was transferred to vector
pET15b. This was accomplished using shotgun ligations; the DNAs
were cut with restriction enzymes Nco I and Xho I. The resulting
recombinant is called pDAB2028.

Expression of TcbA in E. coli from Plasmid pDAB2028

Expression of tcbA in E. coli was obtained by modification of
the methods previously described by Studier et al. (Studier, F. W.,
Rosenberg, A., Dunn, J., and Dubendorff, J. (1990) Use of T7 RNA
polymerase to direct expression of cloned genes. Methods Enzymol.,
185: 60-89). Competent E. coli cells, strain BL21(DE3), were
transformed with plasmid pDAB2028 and plated on LB agar containing
100 µg/mL ampicillin and 40 mM glucose. The transformed cells were
plated to a density of several hundred isolated colonies/plate.
Following overnight incubation at 37°C the cells were scraped from
the plates and suspended in LB broth containing 100 µg/mL
ampicillin. Typical culture volumes were from 200-500 mL. At time
zero, culture densities (OD600) were from 0.05-0.15 depending on
the experiment. Cultures were shaken at one of three temperatures
(22°C, 30°C or 37°C) until a density of 0.15-0.5 was obtained, at
which time they were induced with 1 mM isopropylthio-β-galactoside
(IPTG). Cultures were incubated at the designated temperature for
4-5 hours and then were transferred to 4°C until processing (12-72
hours).



Purification and Characterization of TcbA Expressed in E. coli from
Plasmid pDAB2028

E. coli cultures expressing TcbA peptides were processed as
follows. Cells were harvested by centrifugation at 17,000 x g and
the media was decanted and saved in a separate container.
The media was concentrated about 8x using the M12 (Amicon,
Beverly, MA) filtration system and a 100 kD molecular mass cut-off
filter. The concentrated media was loaded onto an anion exchange
column and the bound proteins were eluted with 1.0 M NaCl. The 1.0
M NaCl elution peak was found to cause mortality against Southern
corn rootworm (SCR) larvae (Table 31). The 1.0 M NaCl fraction was
dialyzed against 10 mM sodium phosphate buffer pH 7.0,
concentrated, and subjected to gel filtration on Sepharose CL-4B
(Pharmacia, Piscataway, NJ). The region of the CL-4B elution
profile corresponding to the same calculated molecular weight
(about 900 kDa) as the native W-14 toxin complex was collected,
concentrated and bioassayed against larvae. The collected 900 kDa
fraction was found to have insecticidal activity (see Table 31
below), with symptomology similar to that caused by native W-14
toxin complex. This fraction was subjected to Proteinase K and heat
treatment; the activity in both cases was either eliminated or
reduced, providing evidence that the activity is proteinaceous in
nature. In addition, the active fraction tested immunologically
positive for the TcbA and TcbAiii peptides in immunoblot analysis
when tested with an anti-TcbAiii monoclonal antibody (Table 31).



Table 31

Results of Immunoblot and SCR Bioassays

                        SCR Activity                  Immunoblot      Native Size
                                    % Growth          Peptides        [CL-4B
Fraction                Mortality   Inhibit.          Detected        Estimated Size]

TcbA Media 1.0 M        +++         +++               TcbA
Ion Exchange

TcbA Media CL-4B        +++         +++               TcbA, TcbAiii   about 900 kDa

TcbA Media CL-4B        ++          +++               NT
+ Proteinase K

TcbA Media CL-4B                                      NT
+ heat treatment

TcbA Cell Sup CL-4B                 +++               NT              about 900 kDa

PK = Proteinase K treatment 2 hours; Heat treatment = 100°C for 10
minutes; ND = None Detected; NT = Not Tested. Scoring system for
mortality and growth inhibition as compared to control samples:
5-24% = "+", 25-49% = "++", 50-100% = "+++".
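The scoring legend above maps a percentage (mortality or growth inhibition relative to control) onto a symbol. A minimal sketch of that mapping follows; the function name is illustrative, and two points are assumptions because the legend does not state them: values below 5% return an empty string, and fractional values between the stated bands (for example 24.5%) are assigned to the lower band.

```python
def score_symbol(percent):
    """Convert a percent effect versus control into the legend's symbol."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    if percent >= 50:
        return "+++"
    if percent >= 25:
        return "++"
    if percent >= 5:
        return "+"
    return ""  # ASSUMPTION: the legend defines no symbol below 5%


for value in (3, 5, 24, 25, 49, 50, 100):
    print(value, repr(score_symbol(value)))
```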

The cell pellet was resuspended in 10 mM sodium phosphate
buffer, pH = 7.0, and lysed by passage through a Bio-Neb™ cell
nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were
treated with DNase to remove DNA and centrifuged at 17,000 x g to
separate the cell pellet from the cell supernatant. The
supernatant fraction was decanted and filtered through a 0.2 micron
filter to remove large particles and subjected to anion exchange
chromatography. Bound proteins were eluted with 1.0 M NaCl,
dialyzed and concentrated using Biomax™ (Millipore Corp., Bedford,
MA) concentrators with a molecular mass cut-off of 50,000 Daltons.
The concentrated fraction was subjected to gel filtration
chromatography using Sepharose CL-4B beaded matrix. Bioassay data
for material prepared in this way are shown in Table 31 and are
denoted as "TcbA Cell Sup".

In yet another method to handle large amounts of material, the
cell pellets were re-suspended in 10 mM sodium phosphate buffer, pH
= 7.0 and thoroughly homogenized by using a Kontes Glass Company
(Vineland, NJ) 40 ml tissue grinder. The cellular debris was
pelleted by centrifugation at 25,000 x g and the cell supernatant
was decanted, passed through a 0.2 micron filter and subjected to
anion exchange chromatography using a Pharmacia 10/10 column packed
with Poros HQ 50 beads. The bound proteins were eluted by
performing a NaCl gradient of 0.0 to 1.0 M. Fractions containing
the TcbA protein were combined and concentrated using a 50 kDa
concentrator and subjected to gel filtration chromatography using
Pharmacia CL-4B beaded matrix. The fractions containing TcbA
oligomer, molecular mass of approximately 900 kDa, were collected
and subjected to anion exchange chromatography using a Pharmacia
Mono Q 10/10 column equilibrated with 20 mM Tris buffer pH = 7.3.
A gradient of 0.0 to 1.0 M NaCl was used to elute recombinant TcbA
protein. Recombinant TcbA eluted from the column at a salt
concentration of approximately 0.3-0.4 M NaCl, the same molarity at
which native TcbA oligomer is eluted from the Mono Q 10/10 column.
The recombinant TcbA fraction was found to cause SCR mortality in
bioassay experiments similar to those in Table 31.



A second set of expression constructions was prepared and tested
for expression of the TcbA protein toxin.

Construction of pDAB2030: An Expression Plasmid for the tcbA
Coding Region

The plasmid pDAB2028 (see herein) contains the tcbA coding
region in the commercial vector pET15 (Novagen, Madison, WI), which
encodes an ampicillin selection marker. The plasmid pDAB2030 was
created to express the tcbA coding region from a plasmid which
encodes a kanamycin selection marker. This was done by cutting
pET27 (Novagen, Madison, WI), a kanamycin selection plasmid, and
pDAB2028 with Xba I and Xho I. This releases the entire multiple
cloning site, including the tcbA coding region, from plasmid
pDAB2028. The two cut plasmids were mixed and ligated.
Recombinant plasmids were selected on kanamycin and those
containing the pDAB2028 fragment were identified by restriction
analysis. The new recombinant plasmid is called pDAB2030.

Construction of Plasmid pDAB2031: Correction of Mutations in tcbAi

The two mutations in the N-terminus of the tcbA coding region
as described in Example 19 (Sequence ID NO: 11; A>G at 212 changing
asparagine 71 to serine 71; G>A at 229 changing alanine 77 to
threonine 77) were corrected as follows. A PCR product was
generated using the primers TH50 (5' ACC GTC TTC TTT ACG ATC AGT G
3') and S1Ac51 (5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3')
and pDAB2025 as template to generate a 1778 bp product. This PCR
product was cloned into plasmid pCR2.1 (Invitrogen, San Diego, CA)
and a clone was isolated and sequenced. The clone was digested
with Nco I and Pin AI and a 1670 bp fragment was purified from a 1%
agarose gel. A plasmid containing the mutated tcbA coding region
(pDAB2030) was digested with Nco I and Not I and purified away from
the 1670 bp fragment in a 0.8% agarose gel with Qiaex II (Qiagen,
Chatsworth, CA). The corrected Nco I/Pin AI fragment was then
ligated into pDAB2030. The ligated DNA was transformed into DH5α
E. coli. A clone was isolated, sequenced and found to be correct.
This plasmid, containing the corrected tcbA coding region, is
called pDAB2031.

Construction of pDAB2033 and pDAB2034: Expression Plasmids for
tcbA

The expression plasmids pDAB2025 and pDAB2027-2031 all rely on
the Bacteriophage T7 expression system. An additional vector
system was used for bacterial expression of the tcbA gene and its
derivatives. The expression vector Trc99a (Pharmacia Biotech,
Piscataway, NJ) contains a strong trc promoter upstream of a
multiple cloning site with a 5' Nco I site which is compatible with
the tcbA coding region from pDAB2030 and 2031. However, the
plasmid does not have a compatible 3' site. Therefore, the Hind
III site of Trc99a was cut and made blunt by treatment with T4 DNA
polymerase (Boehringer Mannheim, Indianapolis, IN). The vector
plasmid was then cut by Nco I followed by treatment with alkaline
phosphatase. The plasmids pDAB2030 and pDAB2031 were each cut with
Xho I (cuts at the 3' end of the tcbA coding region) followed by
treatment with T4 DNA polymerase to blunt the ends. The plasmids
were then cut with Nco I, and the DNAs were extracted with phenol,
ethanol precipitated and resuspended in buffer. The Trc99a and
pDAB2030 and pDAB2031 plasmids were mixed separately, ligated and
transformed into DH5α cells and plated on LB media containing
ampicillin and 50 mM glucose. Recombinant plasmids were identified
by restriction digestion. The new plasmids are called pDAB2033
(contains the tcbA coding sequence with the two mutations in tcbAi)
and pDAB2034 (contains the corrected version of tcbA from
pDAB2031).

Construction of Plasmid pDAB2032: An Expression Plasmid for
tcbAiiAiii

A plasmid encoding the TcbAiiAiii portion of TcbA was created
in a similar way as plasmid pDAB2031. A PCR product was generated
using TH42 (5' TAG GTC TCC ATG GCT TTT ATA CAA GGT TAT AGT GAT CTG
3') and TH50 (5' ACC GTC TTC TTT ACG ATC AGT G 3') primers and
plasmid pDAB2025 as template. This yielded a product of 1521 bp
having an initiation codon at the beginning of the coding sequence
of tcbAii. This PCR product was isolated in a 1% agarose gel and
purified. The purified product was cloned into pCR2.1 as above and
a correct clone was identified by DNA sequence analysis. This
clone was digested with Nco I and Pin AI, and a 1414 bp fragment was
isolated in a 1% agarose gel and ligated into the Nco I and Pin AI
sites of plasmid pDAB2030 and transformed into DH5α E. coli. This
new plasmid, designed to express TcbAiiAiii in E. coli, is called
pDAB2032.

Expression of tcbA and tcbAiiAiii from Plasmids pDAB2030, pDAB2031
and pDAB2032

Expression of tcbA in E. coli from plasmids pDAB2030, pDAB2031
and pDAB2032 was as described herein, except expression of
tcbAiiAiii was done in E. coli strain HMS174(DE3) (Novagen, Madison,
WI).

Expression of tcbA from Plasmid pDAB2033

The plasmid pDAB2033 was transformed into BL21 cells (Novagen,
Madison, WI) and plated on LB containing 100 micrograms/mL
ampicillin and 50 mM glucose. The plates were spread such that
several hundred well separated colonies were present on each plate
following incubation at either 30°C or 37°C overnight. The
colonies were scraped from the plates and suspended in LB
containing 100 micrograms/mL ampicillin, but no glucose. Typical
culture volume was 250 mL in a single 1 L baffle bottom flask. The
cultures were induced when the culture reached a density of 0.3-0.6
OD600 nm. Most often this density was achieved immediately after
suspension of the cells from the plates and did not require a
growth period in liquid media. Two induction methods were used.
Method 1: cells were induced with 1 mM IPTG at 37°C. The cultures
were shaken at 200 rpm on a platform shaker for 5 hours and
harvested. Method 2: the cultures were induced with 25 micromolar
IPTG at 30°C and shaken at 200 rpm for 15 hours at either 20°C or
30°C. The cultures were stored at 4°C until used for purification.

Purification of TcbA from E. coli

Purification, bioassay and immunoblot analysis of TcbA and
TcbAiiAiii was as described herein. Results of several
representative E. coli expression experiments are shown in Table
32. All materials shown in Table 32 were purified from the media
fraction of the cultures. The predicted native molecular weight is
approximately 900 kD as described herein. The purity of the
samples, that is, the amount of TcbA relative to contaminating
proteins, varied with each preparation.

Table 32

Bioassay Activity and Immunoblot Analysis of TcbA and Derivatives
Produced in E. coli and Purified from the Culture Media

                                          Southern Corn Rootworm    Peptides             Micrograms
                                          Bioassay Activity         Detected             Protein
             Coding       E. coli         % Growth    %             by                   Applied
Plasmid      Region       Strain          Inhibit.    Mortal.       Immunoblot           to Diet

pDAB2030     tcbA         BL21(DE3)                   +++           TcbA + TcbAiii       1-8
pDAB2031     tcbA         BL21(DE3)                   +++           TcbA + TcbAiii       1-10
[illegible]  tcbA         BL21                        +++           TcbA + TcbAiii       1-2
pDAB2032     tcbAiiAiii   HMS174(DE3)     +++         [illegible]   TcbAiiAiii + TcbAiii 13-27

Scoring system for Southern Corn Rootworm bioassay activity as
compared to control samples: 5-24% = "+", 25-49% = "++",
50-100% = "+++".



Example 20

Characterization of Toxin Peptides with Matrix-Assisted Laser
Desorption Ionization Time-of-Flight Mass Spectroscopy

Toxins isolated from W-14 broth were purified as described in
Example 15. In some cases, the TcaB protein toxin was pretreated
with proteases (Example 16) that had been isolated from W-14 broth
as previously described (Example 15). Protein molecular mass was
determined using matrix-assisted laser desorption ionization
time-of-flight mass spectroscopy, hereinafter MALDI-TOF, on a VOYAGER
BIOSPECTROMETRY workstation with DELAYED EXTRACTION technology
(PerSeptive Biosystems, Framingham, MA). Typically, the protein of
interest (100-500 pmoles in 5 µL) was mixed with 1 µL of
acetonitrile and dialyzed for 0.5 to 1 h on a Millipore VS filter
having a pore size of 0.025 µm (Millipore Corp., Bedford, MA).
Dialysis was performed by floating the filter on water (shiny side
up) followed by adding the protein-acetonitrile mixture as a droplet
to the surface of the filter. After dialysis, the dialyzed protein
was removed using a pipette and was then mixed with a matrix
consisting of sinapinic acid and trifluoroacetic acid according to
the manufacturer's instructions. The protein and matrix were allowed
to co-crystallize on an about 3 cm² gold-plated sample plate
(PerSeptive Corp.). Excitation of the crystals and subsequent mass
analysis were performed using the following conditions: laser
setting of 3050; pressure of 4.55e-07; low mass gate of 1500.0;
negative ions off; accelerating voltage of 25,000; grid voltage of
90.0%; guide wire voltage of 0.010%; linear mode; and a pulse delay
time of 350 ns.

Protein mass analysis data are shown in Table 33. The data
obtained from MALDI-TOF were compared to the masses predicted from
gene sequence information and those previously determined by SDS-PAGE.
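The agreement between gene-predicted and MALDI-TOF masses reported in Table 33 can be expressed as a percent deviation. The sketch below is illustrative only: the mass values are transcribed from Table 33 for the six peptides that have a gene-predicted mass, and the dictionary and function names are not part of the original disclosure.

```python
# (predicted from gene sequence, observed by MALDI-TOF), in Daltons,
# transcribed from Table 33.
MASSES = {
    "TcbA": (280634, 281040),
    "TcbAi/ii": (217710, 216812),
    "TcbAii": (207698, 206473),
    "TcbAiii": (62943, 63520),
    "TcdAii": (209218, 208186),
    "TcdAiii": (63520, 63544),
}


def percent_deviation(predicted, observed):
    """Signed deviation of the observed mass from the predicted mass, in percent."""
    return 100.0 * (observed - predicted) / predicted


deviations = {name: percent_deviation(*pair) for name, pair in MASSES.items()}
for name, dev in deviations.items():
    print(f"{name:10s} {dev:+.3f} %")
```

With these transcribed values every MALDI-TOF mass lies within one percent of the gene-predicted mass, whereas the SDS-PAGE estimates in the same table differ by considerably more.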



Table 33

Molecular Analysis of Peptides by MALDI-TOF, SDS-PAGE and Predicted
Determination Based on Gene Sequence

Peptide                       Predicted (Gene)   SDS-PAGE       MALDI-TOF

TcbA                          280,634 Da         240,000 Da     281,040 Da
TcbAi/ii                      217,710 Da         not resolved   216,812 Da
TcbAii                        207,698 Da         201,000 Da     206,473 Da
TcbAiii                       62,943 Da          58,000 Da      63,520 Da
TcdAii                        209,218 Da         188,000 Da     208,186 Da
TcdAiii                       63,520 Da          56,000 Da      63,544 Da
TcbAii Protease Generated                        201,000 Da     216,614 Da (a)
                                                                215,123 Da (a)
                                                                210,391 Da (a)
                                                                208,680 Da (a)
TcbAiii Protease Generated                       56,000 Da      64,111 Da

(a) Data normalized to TcbA; multiple fragments observed at TcbAi/ii.



Example 21

Production of Peptide Specific Polyclonal Antibodies

Nine peptide components of the W-14 toxin complex, namely,
TcaAii, TcaAiii, TcaBi, TcaBii, TcaC, TcbAii, TcbAiii, TcdAii, and
TcdAiii were selected as targets against which antibodies were
produced. Comprehensive DNA and deduced amino acid sequence data
for these peptides indicated that the sequence homology between
some of these peptides was substantial. If a whole peptide was
used as the immunogen to induce antibody production, the resulting
antibodies might bind to multiple peptides in the toxin
preparation. To avoid this problem, antibodies were generated that
would bind specifically to a unique region of each peptide of
interest. The unique region (subpeptide) of each target peptide
was selected based on the analyses described below.

Each entire peptide sequence was analyzed using MacVector™
Protein Analysis Tool (IBI Sequence Analysis Software,
International Biotechnologies, Inc., P.O. Box 9558, New Haven, CT
06535) to determine its antigenicity index. This program was
designed to locate possible externally-located amino acid
sequences, i.e., regions that might be antigenic sites. This
method combined information from hydrophilicity, surface
probability, and backbone flexibility predictions along with the
secondary structure predictions in order to produce a composite
prediction of the surface contour of a protein. The scores for
each of the analyses were normalized to a value between -1.0 and
+1.0 (MacVector™ Manual). The antigenicity index value was
obtained for the entire sequence of the target peptide. From each
peptide, an area covering 19 or more amino acids that showed a high
antigenicity index from the original sequence was re-analyzed to
determine the antigenicity index of the subpeptide without the
flanking residues. This re-analysis was necessary because the
antigenicity index of a peptide could be influenced by the flanking
amino acid residues. If the isolated subpeptide sequence did not
maintain a high antigenicity index, a new region was chosen and the
analysis was repeated.

Each selected subpeptide sequence was aligned and compared to
all seven target peptide sequences using the MacVector alignment
program. If a selected subpeptide sequence showed identity
(greater than 20%) to another target peptide, a new region of 19 or
more amino acids was isolated and re-analyzed. Unique subpeptide
sequences covering 19 or more amino acids and showing a high
antigenicity index were selected from all target peptides.

The sequences of seven subpeptides were sent to Genemed
Biotechnology Inc. The last amino acid residue on each subpeptide
was deleted because it showed no apparent effect on the
antigenicity index. A cysteine residue was added to the N-terminal
of each subpeptide sequence, except TcaBi-syn which contains an
internal cysteine residue. The presence of a cysteine residue
facilitates conjugation of a carrier protein (KLH). The final
peptide products corresponding to the appropriate toxin peptides
and SEQ ID NO.s are shown in Table 34.

Table 34

Amino Acid Sequences for Synthetic Peptides

SEQ ID NO.   Peptide        Amino Acid Sequence

63           TcaAii-syn     NH2-(C)LRGNSPTNPDKDGIFAQVA
64           TcaAiii-syn    NH2-(C)YTPDQTPSFYETAFRSADG
65           TcaBi-syn      NH2-HGQSYNDNNYCNFTLSINT
66           TcaBii-syn     NH2-(C)VDPKTLQRQQAGGDGTGSS
67           TcaC-syn       NH2-(C)YKAPQRQEDGDSNAVTYDK
68           TcbAii-syn     NH2-(C)YNENPSSEDKKWYFSSKDD
69           TcbAiii-syn    NH2-(C)FDSYSQLYEENINAGEQRA
70           TcdAii-syn     NH2-(C)NPNNSSNKLMFYPVYQYSGNT
71           TcdAiii-syn    NH2-(C)VSQGSGSAGSGNNNLAFGAG
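The preparation rule stated before Table 34 (delete the last residue of the selected subpeptide, then add an N-terminal cysteine unless the peptide already contains one) can be sketched as follows. This is an illustration only: the function name is hypothetical, and the trailing "X" in the usage lines is a placeholder for the deleted residue, whose identity is not given in the text.

```python
def prepare_immunogen(subpeptide):
    """Apply the two stated edits to a selected subpeptide sequence.

    1. Drop the last residue.
    2. Prepend cysteine (for carrier conjugation) unless an internal
       cysteine is already present.
    """
    trimmed = subpeptide[:-1]
    if "C" in trimmed:
        return trimmed
    return "C" + trimmed


# "X" stands in for the unknown deleted C-terminal residue (placeholder).
print(prepare_immunogen("LRGNSPTNPDKDGIFAQVAX"))
print(prepare_immunogen("HGQSYNDNNYCNFTLSINTX"))
```

Applied to these placeholder inputs, the rule reproduces the TcaAii-syn entry (with its added cysteine) and the TcaBi-syn entry (left without an added cysteine) as listed in Table 34.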



Each conjugated synthetic peptide was injected into two
rabbits according to the Genemed accelerated program. The pre- and
post-immune sera were available for testing after one month.

The preliminary test of both pre- and post-immune sera from
each rabbit was performed by Genemed Biotechnologies Inc. Genemed
reported that by using both ELISA and Western blot techniques, they
detected the reaction of post-immune sera to the respective
synthetic peptides. Subsequently, the sera were tested with the
whole target peptides by Western blot analysis. Two batches of
partially purified Photorhabdus strain W-14 toxin complex were used
as the antigen. The two samples had shown activity against the
Southern corn rootworm. Their peptide patterns on an SDS-PAGE gel
were slightly different.

Pre-cast SDS-polyacrylamide gels with 4-20% gradient
(Integrated Separation Systems, Natick, MA 01760) were used.
Between 1 and 8 µg of protein was applied to each gel well.
Electrophoresis was performed and the protein was electroblotted
onto Hybond-ECL™ nitrocellulose membrane (Amersham International).
The membrane was blocked with 10% milk in TBST (25 mM Tris-HCl pH
7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room
temperature. Each rabbit serum was diluted in 10% milk/TBST to
1:500. Other dilutions between 1:50 and 1:1000 were also used. The
serum was added to the membrane and placed on a platform rocker for
at least one hour. The membrane was washed thoroughly with the
blocking solution or TBST. A 1:2000 dilution of secondary
antibodies (goat anti-mouse IgG conjugated to horseradish
peroxidase; BioRad Laboratories) in 10% milk/TBST was applied to
the membrane, which was placed on a platform rocker for one hour.
The membrane was subsequently washed with an excess amount of TBST.
The detection of the protein was performed by using an ECL (Enhanced
Chemiluminescence) detection kit (Amersham International).

Western blot analyses were performed to identify the binding
specificity of each anti-synthetic peptide antibody. All
synthetic polyclonal antibodies showed specificity toward
processed and, when applicable, unprocessed target peptides from
protein fractions derived from Photorhabdus culture broth. Various
antibodies were shown to recognize either unprocessed or processed
recombinant proteins derived from heterologous expression systems
such as bacteria or insect cells, using baculovirus expression
constructs. In one case, the anti-TcbAiii-syn antibody showed some
cross-reactivity to the TcdAiii peptide. In a second case, the
anti-TcaC-syn antibody recognized an unidentified 190 kDa peptide
in W-14 toxin complex fractions.



Example 22

Characterization of Photorhabdus Strains

In order to establish that the collection described herein was
comprised of Photorhabdus strains, the strains herein were assessed
in terms of recognized microbiological traits that are
characteristic of the bacterial genus Photorhabdus and which
differentiate it from other Enterobacteriaceae and Xenorhabdus spp.
(Farmer, J. J. 1984. Bergey's Manual of Systematic Bacteriology, Vol
1. pp. 510-511. (ed. Krieg, N. R. and Holt, J. G.). Williams &
Wilkins, Baltimore; Akhurst and Boemare, 1988, J. Gen. Microbiol.
134, 1835-1845; Forst and Nealson, 1996. Microbiol. Rev. 60, 21-43).
These characteristic traits are as follows: Gram stain
negative rods, organism size of 0.3-2 µm in width and 2-10 µm in
length [with occasional filaments (15-50 µm) and spheroplasts],
yellow to orange/red colony pigmentation on nutrient agar, presence
of crystalline inclusion bodies, presence of catalase, inability to
reduce nitrate, presence of bioluminescence, ability to take up dye
from growth media, positive for protease production, growth at
temperatures below 37°C, survival under anaerobic conditions and
positively motile (Table 33). Test methods were checked using
reference Escherichia coli, Xenorhabdus and Photorhabdus strains.
The overall results are consistent with all strains being part of
the family Enterobacteriaceae and the genus Photorhabdus. Note
that DEP1, DEP2, and DEP3 refer to Photorhabdus strains obtained
from the American Type Culture Collection, 12301 Parklawn Drive,
Rockville, MD 20852 USA (#29304, 29999 and 51583, respectively).
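The bioluminescence test in this example compares each broth reading with the reading for uninoculated medium, taking about 5-fold background as positive, and normalizes light output to cell density. A minimal sketch of that decision follows. The function names and the numerical readings are hypothetical, and subtracting background before dividing by absorbance is an assumption; the text states only that light units were normalized to cell density.

```python
def is_luminescent(sample_rlu, background_rlu, fold=5.0):
    """Positive call when the sample reads at least `fold` times background."""
    if background_rlu <= 0:
        raise ValueError("background reading must be positive")
    return sample_rlu >= fold * background_rlu


def rlu_per_density(sample_rlu, background_rlu, absorbance):
    """Background-corrected light units per unit of culture absorbance.

    ASSUMPTION: background is subtracted before normalization.
    """
    if absorbance <= 0:
        raise ValueError("absorbance must be positive")
    return (sample_rlu - background_rlu) / absorbance


# Hypothetical readings, for illustration only.
background = 10.0
print(is_luminescent(60.0, background))        # at least 5-fold background
print(is_luminescent(40.0, background))        # below the 5-fold threshold
print(rlu_per_density(60.0, background, 2.0))  # normalized light output
```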

A luminometer was used to establish the bioluminescence
associated with these Photorhabdus strains. To measure the
presence or absence of relative light emitting units, the broths
from each strain (cells and media) were measured at three time
intervals after inoculation in liquid culture (24, 48, 72 hr) and
compared to background luminosity (uninoculated media). Several
Xenorhabdus strains were tested as negative controls for
luminosity. Prior to measuring light emission from the various
broths, cell density was established by measuring light absorbance
(560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using
a sipper cell. The resulting light emitting units could then be
normalized to density of cells. Aliquots of the broths were placed
into 96-well microtiter plates (100 µL each) and read in a Packard
Lumicount™ luminometer (Packard Instrument Co., Meriden, CT). The
measurement period for each sample was 0.1 to 1.0 second. The
samples were agitated in the luminometer for 10 sec prior to taking
readings. A positive test was determined as being about 5-fold
background luminescence (about 1-15 relative light units). In
addition, degree of colony luminosity was confirmed with
photographic film overlays and by eye, after visual adaptation in a
darkroom.

The Gram's staining characteristics of each strain were
established with a commercial Gram's stain kit (BBL, Cockeysville,
MD) used in conjunction with Gram's stain control slides (Fisher
Scientific, Pittsburgh, PA). Microscopic evaluation was then
performed using a Zeiss microscope (Carl Zeiss, Germany) 100X oil
immersion objective lens (with 10X ocular and 2X body
magnification). Microscopic examination of individual strains for
organism size, cellular description and inclusion bodies (the
latter two observations after logarithmic growth) was performed
using wet mount slides (10X ocular, 2X body and 40X objective
magnification) and phase contrast microscopy with a micrometer
(Akhurst, R. J. and Boemare, N. E. 1990. Entomopathogenic Nematodes
in Biological Control (ed. Gaugler, R. and Kaya, H.). pp. 75-90.
CRC Press, Boca Raton, USA; Baghdiguian S., Boyer-Giglio M. H.,
Thaler, J. O., Bonnot G., Boemare N. 1993. Biol. Cell 79, 177-185).
Colony pigmentation was observed after inoculation on Bacto
nutrient agar (Difco Laboratories, Detroit, MI) prepared as per
label instructions. Incubation occurred at 28°C and descriptions
were produced after 5 days. To test for the presence of the enzyme
catalase, a colony of the test organism was removed on a small plug 
from a nutrient agar plate and placed into the bottom of a glass 
5 test tube. One ml of a household hydrogen peroxide solution was 
gently added down the side of the tube. A positive reaction was 
recorded when bubbles of gas (presumptive oxygen) appeared 
immediately or within 5 seconds. Controls of uninoculated nutrient 
agar and hydrogen peroxide solution were also examined. To test 
10 for nitrate reduction, each culture was inoculated into 10 ml of 
Bacto Nitrate Broth (Difco Laboratories, Detroit, MI) . After 24 
hours incubation with gentle agitation at 28°C, nitrite production 
was tested by the addition of two drops of sulfanilic acid reagent 
and two drops of alpha -naphthylamine reagent (see Difco Manual, 
15 10th edition, Difco Laboratories, Detroit, MI, 1984). The 

generation of a distinct pink or red color indicates the formation 
of nitrite from nitrate whereas the lack of color formation 
indicates that the strain is nitrate reduction negative. In the 
latter case, finely powdered zinc was added to further confirm the 
20 presence of unreduced nitrate; established by the formation of 

nitrite and the resultant red color. The ability of each strain to 
take up dye from growth media was tested with Bacto MacConkey agar
containing the dye neutral red, Bacto Tergitol-7 agar containing
the dye bromothymol blue and Bacto EMB Agar containing the dye
eosin-Y (formulated agars from Difco Laboratories, Detroit, MI, all
prepared according to label instructions). After inoculation on
these media, dye uptake was recorded after incubation at 28°C for 5
days. Growth on these latter media is characteristic for members
of the family Enterobacteriaceae. Motility of each strain was
tested using a solution of Bacto Motility Test Medium (Difco
Laboratories, Detroit, MI) prepared as per label instructions. A
butt-stab inoculation was performed with each strain and motility
was judged macroscopically by a diffuse zone of growth spreading
from the line of inoculum. The production of protease was tested
by observing hydrolysis of gelatin using Bacto gelatin (Difco
Laboratories, Detroit, MI) made as per label instructions.
Cultures were inoculated and the tubes or plates were incubated at
28°C for 5 days. Gelatin hydrolysis was then checked at room
temperature, i.e. less than 22°C. To assess growth at different
temperatures, agar plates [2% proteose peptone #3 with two percent
Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked
from a common source of inoculum. Plates were incubated at 20, 28
and 37°C for up to three weeks. The incubator temperature levels
were checked with an electronic thermocouple and meter to ensure
valid temperature settings. Oxygen requirements for Photorhabdus
strains were tested in the following manner. A butt-stab
inoculation into fluid thioglycolate broth medium (Difco, Detroit,
MI) was made. The tubes were incubated at room temperature for one
week and cultures were then examined for type and extent of growth.
The indicator resazurin demonstrates the presence of medium
oxygenation or the aerobiosis zone (Difco Manual, 10th edition,
Difco Laboratories, Detroit, MI). Growth zone results obtained for
the Photorhabdus strains tested were consistent with those of a
facultative anaerobic microorganism. In the case of unclear
results, the final agar concentration of fluid thioglycolate broth
medium was raised to 0.75% and the growth characteristics
rechecked.

Table 35

Taxonomic Traits of Photorhabdus Strains

Strain        | A* | B | C | D    | E | F | G | H | I | J  | K | L | M | N | O | P | Q
P. zealandica | -  | + | + | rd S | + | - | + | + |   | PO | + |   |   | + | + | + | -
P. hepialus   | -  | + | + | rd S | + | - | + | + |   | Y  | + | + |   | + | + | + | -
HB-Arg        | -  | + |   | rd S |   | - | + | + |   | W  | + |   |   | + | + | + | -
HB Oswego     | -  |   | + | rd S | + | - | + | + | + | W  |   |   | + | + | + | + | -
HB Lewiston   | -  | + | + | rd S | + | - |   | + | + | T  | + | + | + | + | + | + |
K-122         | -  | + | + | rd S | + | - | + | + |   | Y  | + | + |   | + | + | + | -
HMGD          |    | + | + | rd S | + | - | + | + |   | R  | + | + |   |   | + | + | -
Indicus       | -  | + |   | rd S | + | - |   | + | + | W  | + |   | + | + | + | + | -
GD            | -  |   | + | rd S | + | - | + | + | + | YT |   |   | + | + | + | + | -
PWH-5         | -  | + | + | rd S | + | - | + | + | + | Y  | + |   |   | + | + | + | -
Megidis       | -  | + | + | rd S | + | - |   | + | + | R  | + |   | + | + | + | + |
HF-85         | -  | + | + | rd S |   | - | + | + | + | R  | + |   | + | + | + | + | -
A. Cows       | -  |   | + | rd S | + | - |   | + | + | PR |   | + |   | + | + | + | -
MP1           | -  | + | + | rd S | + | - | + | + | + | T  |   | + | + | + | + | + | -
MP2           | -  | + | + | rd S | + | - | + | + | + | T  | + | + | + | + | + | + | -
MP3           | -  | + | + | rd S | + | - | + | + | + | R  |   |   | + | + | + | + | -
MP4           | -  | + | + | rd S | + | - |   | + |   | Y  |   | + | + | + | + | + | -
MP5           | -  | + | + | rd S | + |   |   |   | + | PR |   |   |   | + | + | + | -
GL98          |    | + |   | rd S | + |   |   | + | + | W  | + | + | + | + | + | + |
GL101         |    | + | + | rd S | + |   | + | + |   | W  | + | + | + | + | + | + |
GL138         |    | + | + | rd S | + |   | + |   | + | W  | + |   | + |   | + | + |
GL155         |    | + |   | rd S | + |   |   |   | + | W  | + |   | + | + | + | + |
GL217         |    | + | + | rd S | + |   | + |   | + | Y  |   |   |   |   | + | + |
GL257         |    |   | + | rd S |   |   | + | + |   | O  | + | + | + | + | + | + |
DEP1          |    |   | + | rd S |   |   |   | + | + | W  |   |   | + | + | + | + |
DEP2          |    | + | + | rd S |   |   | + | + | + | PR |   |   | + | + | + | + |
DEP3          |    | + | + | rd S | + |   |   |   | + | CR |   | + | + | + |   | + |

*: A=Gram's stain, B=Crystalline inclusion bodies,
C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction,
G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake,
J=Pigmentation on Nutrient Agar (some color shifts after Day 5),
K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on
Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20°C,
P=Growth at 28°C, Q=Growth at 37°C.

†: +=positive for trait, -=negative for trait; rd=rod, S=sized
within Genus descriptors.

§: W=white, CR=cream, Y=yellow, YT=yellow tan, T=tan, PO=pale
orange, O=orange, PR=pale red, R=red.

The evolutionary diversity of the Photorhabdus strains in our
collection was measured by analysis of PCR (Polymerase Chain
Reaction) mediated genomic fingerprinting using genomic DNA from
each strain. This technique is based on families of repetitive DNA
sequences present throughout the genome of diverse bacterial
species (reviewed by Versalovic, J., Schneider, M., de Bruijn,
F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol., 5, 25-40).
Three of these, repetitive extragenic palindromic sequence (REP),
enterobacterial repetitive intergenic consensus (ERIC) and the BOX
element, are thought to play an important role in the organization
of the bacterial genome. Genomic organization is believed to be
shaped by selection and the differential dispersion of these
elements within the genome of closely related bacterial strains can
be used to discriminate these strains (e.g., Louws, F. J.,
Fulbright, D. W., Stephens, C. T. and de Bruijn, F. J. 1994. Appl.
Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide
primers complementary to these repetitive sequences to amplify the
variably sized DNA fragments lying between them. The resulting
products are separated by electrophoresis to establish the DNA
"fingerprint" for each strain.

To isolate genomic DNA from our strains, cell pellets were
resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a
final volume of 10 ml and 12 ml of 5 M NaCl was then added. This
mixture was centrifuged 20 min. at 15,000 x g. The resulting
pellet was resuspended in 5.7 ml of TE and 300 µl of 10% SDS and 60
µl 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY)
were added. This mixture was incubated at 37°C for 1 hr,
approximately 10 mg of lysozyme was then added and the mixture was
incubated for an additional 45 min. One milliliter of 5 M NaCl and
800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then
added and the mixture was incubated 10 min. at 65°C, gently
agitated, then incubated and agitated for an additional 20 min. to
aid in clearing of the cellular material. An equal volume of
chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed
gently then centrifuged. Two extractions were then performed with
an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1).
Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with
70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl
pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by
optical density at 260 nm. To perform rep-PCR analysis of
Photorhabdus genomic DNA the following primers were used: REP1R-I,
5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'. PCR

was performed using the following 25 µl reaction: 7.75 µl H2O, 2.5
µl 10X LA buffer (PanVera Corp., Madison, WI), 10 µl dNTP mix (2.5
mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl
genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and 0.25
µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR
amplification was performed in a Perkin Elmer DNA Thermal Cycler
(Norwalk, CT) using the following conditions: 95°C/7 min. then 35
cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15 min.
at 65°C. After cycling, the 25 µl reaction was added to 5 µl of 6X
gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in
H2O). A 15x20 cm 1% agarose gel was then run in TBE buffer (0.09 M
Tris-borate, 0.002 M EDTA) using 8 µl of each reaction. The gel
was run for approximately 16 hours at 45 V. Gels were then stained
in 20 µg/ml ethidium bromide for 1 hour and destained in TBE buffer
for approximately 3 hours. Polaroid® photographs of the gels were
then taken under UV illumination.

The presence or absence of bands at specific sizes for each
strain was scored from the photographs and entered as a similarity
matrix in the numerical taxonomy software program, NTSYS-pc (Exeter
Software, Setauket, NY). Controls of E. coli strain HB101 and
Xanthomonas oryzae pv. oryzae assayed under the same conditions
produced PCR fingerprints corresponding to published reports
(Versalovic, J., Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids
Res. 19, 6823-6831; Vera Cruz, C. M., Halda-Alija, L., Louws, F.,
Skinner, D. Z., George, M. L., Nelson, R. J., de Bruijn, F. J.,
Rice, C. and Leach, J. E. 1995. Int. Rice Res. Notes, 20, 23-24;
Vera Cruz, C. M., Ardales, E. Y., Skinner, D. Z., Talag, J.,
Nelson, R. J., Louws, F. J., Leung, H., Mew, T. W. and Leach, J. E.
1996. Phytopathology 86, 1352-1359). The data from Photorhabdus
strains were then analyzed with a series of programs within
NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a
matrix of similarity coefficients (using the Jaccard coefficient)
and SAHN (Sequential, Agglomerative, Hierarchical and Nested)
clustering [using the UPGMA (Unweighted Pair-Group Method with
Arithmetic Averages) method], which groups related strains and can
be expressed as a phenogram (Fig. 7). The COPH (cophenetic values)
and MXCOMP (matrix comparison) programs were used to generate a
cophenetic value matrix and compare the correlation between this
and the original matrix upon which the clustering was based. A
resulting normalized Mantel statistic (r) was generated which is a
measure of the goodness of fit for a cluster analysis (r=0.8-0.9
represents a very good fit). In our case r=0.924. Therefore, the
collection is comprised of a diverse group of easily
distinguishable strains representative of the Photorhabdus genus.
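
The band scoring, Jaccard similarity and UPGMA clustering steps
described above can be illustrated with a minimal sketch. It is not
the NTSYS-pc implementation; the strain names and band scores are
hypothetical, the function names are chosen for this example only,
and the cophenetic correlation step is omitted.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band-presence vectors:
    bands shared by both strains / bands present in either strain."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities, keyed by (strain, strain)."""
    names = sorted(profiles)
    return {(p, q): jaccard(profiles[p], profiles[q])
            for p in names for q in names}


def upgma(profiles):
    """Average-linkage (UPGMA) clustering on Jaccard distance (1 - similarity).
    Returns the merges in the order they occur as (cluster, cluster, distance)."""
    sim = similarity_matrix(profiles)
    clusters = {name: (name,) for name in profiles}  # label -> member strains
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
            # Unweighted average of all between-cluster strain distances.
            dist = sum(1.0 - sim[p] for p in pairs) / len(pairs)
            if best is None or dist < best[0]:
                best = (dist, a, b)
        dist, a, b = best
        merged = clusters.pop(a) + clusters.pop(b)
        clusters["(" + a + "," + b + ")"] = merged
        merges.append((a, b, round(dist, 3)))
    return merges


# Hypothetical band scores (1 = band present at that size, 0 = absent).
profiles = {
    "strain_A": [1, 1, 0, 1, 0, 1],
    "strain_B": [1, 1, 0, 1, 1, 1],
    "strain_C": [0, 1, 1, 0, 1, 0],
}
merges = upgma(profiles)
```

In this toy data set strain_A and strain_B share four of five scored
bands and therefore join first; the merge order is what a phenogram
draws.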
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Example 23 

Insecticidal Utility of Toxin(s) Produced
by Various Photorhabdus Strains

Initial "storage" cultures of the various Photorhabdus strains
were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3)
(Difco Laboratories, Detroit, MI) liquid medium with a primary
variant colony in a 500 ml tribaffled flask with a Delong neck,
covered with a Kaput closure. After inoculation, the flask was
incubated for between 24-72 hrs at 28°C on a rotary shaker at 150
rpm, until stationary phase was reached. The culture was
transferred to a sterile bottle containing a sterile magnetic stir
bar and the culture was overlayered with sterile mineral oil to
limit exposure to air. The storage culture was kept in the dark
at room temperature. These cultures were then used as inoculum
sources for the fermentation of each strain.

"Seed" flasks or cultures were produced by either inoculating 
2 ml of an oil overlayered storage culture or by transferring a
primary variant colony into 175 ml sterile medium in a 500 ml
tribaffled flask covered with a Kaput closure. (The use of other
inoculum sources is also possible.) Typically, following 16 hours
incubation at 28°C on a rotary shaker at 150 rpm, the seed culture
was transferred into production flasks. Production flasks were
usually inoculated by adding about 1% of the actively growing seed
culture to sterile 2% PP3 medium (e.g. 2.0 ml per 175 ml sterile
medium). Production of broths occurred in 500 ml tribaffled flasks
covered with a Kaput closure. Production flasks were agitated at
28°C on a rotary shaker at 150 rpm. Production fermentations were
terminated after 24-72 hrs although successful fermentation is not
confined to this time duration. Following appropriate incubation,
the broths were dispensed into sterile 1.0 L polyethylene bottles,
spun at 2600xg for 1 hr at 10°C and decanted from the cell and
debris pellet. Further broth clarification was achieved with a
tangential flow microfiltration device (Pall Filtron, Northborough,
MA) using a 0.5 µm open-channel poly-ether sulfone (PES) membrane
filter. The resulting broths were then concentrated (up to 10-fold)
using a 10,000 or 100,000 MW cut-off membrane, M12 ultra-filtration
device (Amicon, Beverly MA) or centrifugal concentrators (Millipore,
Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 or
100,000 MW pore size. In the case of centrifugal concentrators,
the broth was spun at 2000xg for approximately 2 hr. The membrane
permeate was added to the corresponding retentate to achieve the
desired concentration of components greater than the pore size
used. Following these procedures, the broth was used for
biochemical analysis or filter sterilized using a 0.2 µm cellulose
nitrate membrane filter for biological assessment. Heat
inactivation of processed broth samples was achieved by heating the
samples at 100°C in a sand-filled heat block for 10 minutes.

The broth(s) and toxin complex(es) from different Photorhabdus
strains are useful for reducing populations of insects and were
used in a method of inhibiting an insect population which comprises
applying to a locus of the insect an effective insect inactivating
amount of the active described. A demonstration of the breadth of
insecticidal activity observed from broths of a selected group of
Photorhabdus strains fermented as described above is shown in Table
36. It is possible that improved or additional insecticidal
activities could be detected with these strains through increased
concentration of the broth or by employing different fermentation
methods. Consistent with the activity being associated with a
protein, the insecticidal activity of all strains tested was heat
labile.

Culture broth(s) from diverse Photorhabdus strains show
differential insecticidal activity (mortality and/or growth
inhibition) against a number of insects. More specifically, the
activity is seen against corn rootworm which is a member of the
insect order Coleoptera. Other members of the Coleoptera include
boll weevils, wireworms, pollen beetles, flea beetles, seed beetles
and Colorado potato beetle. The broths and purified toxin
complex(es) are also active against tobacco budworm, tobacco
hornworm and European corn borer which are members of the order
Lepidoptera. Other typical members of this order are beet
armyworm, cabbage looper, black cutworm, corn earworm, codling
moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm,
cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and
fall armyworm. Activity is also observed against German cockroach
which is a member of the order Dictyoptera (or Blattodea). Other
members of this order are oriental cockroach and American
cockroach.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (10-fold concentrated, filter
sterilized), 2% Proteose Peptone #3 (10-fold concentrated),
purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0
were applied directly to the surface (about 1.5 cm²) of artificial
diet (Rose, R. I. and McCabe, J. M. 1973. J. Econ. Entomol. 66,
398-400) in 40 µl aliquots. Toxin complex was diluted in 10 mM
sodium phosphate buffer, pH 7.0. The diet plates were allowed to
air-dry in a sterile flow-hood and the wells were infested with
single, neonate Diabrotica undecimpunctata howardi (Southern corn
rootworm, SCR) hatched from surface sterilized eggs. The plates
were sealed, placed in a humidified growth chamber and maintained
at 27°C for the appropriate period (3-5 days). Mortality and
larval weight determinations were then scored. Generally, 16
insects per treatment were used in all studies. Control mortality
was generally less than 5%.

Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control
medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM
sodium phosphate buffer, pH 7.0 were applied directly to the
surface (about 1.5 cm²) of standard artificial lepidopteran diet
(Stoneville Yellow diet) in 40 µl aliquots. The diet plates were
allowed to air-dry in a sterile flow-hood and each well was
infested with a single, neonate larva. European corn borer
(Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were
obtained from commercial sources and hatched in-house, whereas
tobacco budworm (Heliothis virescens) larvae were supplied
internally. Following infestation with larvae, the diet plates
were sealed, placed in a humidified growth chamber and maintained
in the dark at 27°C for the appropriate period. Mortality and
weight determinations were scored at day 5. Generally, 16 insects
per treatment were used in all studies. Control mortality
generally ranged from about 0 to about 12.5% for control medium and
was less than 10% for phosphate buffer.

Activity against cockroach was tested as follows. Concentrated
(10-fold) Photorhabdus culture broth(s) and control medium (2%
Proteose Peptone #3) were applied directly to the surface (about
1.5 cm²) of standard artificial lepidopteran diet (Stoneville
Yellow diet) in 40 µl aliquots. The diet plates were allowed to
air-dry in a sterile flow-hood and each well was infested with a
single, CO2 anesthetized first instar German cockroach (Blattella
germanica). Following infestation, the diet plates were sealed,
placed in a humidified growth chamber and maintained in the dark at
27°C for the appropriate period. Mortality and weight
determinations were scored at day 5. Control mortality was less than
10%.

Table 36

Observed Insecticidal Spectrum of Broths
from Different Photorhabdus Strains

Photorhabdus strain | Sensitive* insect species
P. zealandica       | 1**, 2,
P. hepialus         | 1, 2, 4
HB-Arg              | 1, 2, 4
HB Oswego           | 1, 2, 4
HB Lewiston         | 1, 2, 4
K-122               | 1, 4
HMGD                | 1, 4
Indicus             | 1, 2, 4
GD                  | 2, 4
PWH-5               | 1, 2, 4
Megidis             | 1, 2, 4
HF-85               | 1, 2, 4
A. Cows             | 1, 4
MP1                 | 1, 2, 4
MP2                 | 1, 2, 4
MP3                 | 4
MP4                 | 1, 4
MP5                 | 4
GL98                | 1, 4
GL101               | 1, 4, 5
GL138               | 1, 2, 4
GL155               | 1, 4
GL217               | 1, 2, 4
GL257               | 1, 4
DEP1                | 1, 4
DEP2                | 1, 2, 3
DEP3                | 4

* = ≥ 25% mortality and/or growth inhibition vs. control
** = 1; Tobacco budworm, 2; European corn borer, 3;
Tobacco hornworm, 4; Southern corn rootworm, 5;
German cockroach.
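
The sensitivity criterion in the table footnote can be sketched as
follows. This is an illustration only: the text does not state how
"vs. control" was computed, so the sketch assumes mortality is
compared as a difference in percentage points and growth inhibition
as the relative reduction in mean larval weight. The function names
and all counts and weights are hypothetical.

```python
def percent_mortality(dead, total):
    """Percentage of insects dead at scoring."""
    return 100.0 * dead / total


def percent_growth_inhibition(mean_weight_treated, mean_weight_control):
    """Relative reduction in mean weight compared with the control."""
    return 100.0 * (1.0 - mean_weight_treated / mean_weight_control)


def is_sensitive(dead_treated, n_treated, dead_control, n_control,
                 mean_weight_treated, mean_weight_control, threshold=25.0):
    """True when mortality (assumed: treated minus control, in percentage
    points) and/or growth inhibition reaches the 25% threshold."""
    excess = (percent_mortality(dead_treated, n_treated)
              - percent_mortality(dead_control, n_control))
    inhibition = percent_growth_inhibition(mean_weight_treated,
                                           mean_weight_control)
    return excess >= threshold or inhibition >= threshold


# Hypothetical wells with 16 insects per treatment, as in the assays above.
strong_effect = is_sensitive(6, 16, 1, 16, 9.5, 10.0)   # 37.5% vs 6.25% dead
weak_effect = is_sensitive(2, 16, 1, 16, 9.0, 10.0)     # small effect on both
```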
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Example .24 

Southern Analysis of Non-W-14 Photorhabdus Strains
Using W-14 Gene Probes

Photorhabdus strains were grown on 2% proteose peptone #3 agar
(Difco Laboratories, Detroit, MI) and insecticidal toxin competence
was maintained by repeated bioassay after passage. A 50 ml shake
culture was produced in 175 ml baffled flasks in 2% proteose
peptone #3 medium, grown at 28°C and 150 rpm for approximately 24
hours. Fifteen ml of this culture were centrifuged (700 x g, 30
min) and frozen in its medium at -20°C until it was thawed (slowly
in ice water) for DNA isolation. The thawed W-14 culture was
centrifuged (900 x g, 15 min, 4°C), and the floating orange
mucopolysaccharide material was removed. The remaining cell
material was centrifuged (25,000 x g, 4°C) to pellet the bacterial
cells, and the medium was removed and discarded.

Total DNA was isolated by an adaptation of the CTAB method
described in section 2.4.1 of Ausubel et al. (1994). The
modifications included a high salt shock, and all volumes were
increased ten-fold over the "miniprep" recommended volumes. All
centrifugations were at 4°C unless otherwise specified. The
pelleted bacterial cells were resuspended in TE buffer (10 mM
Tris-HCl, 1 mM EDTA, pH 8) to a final volume of 10 ml, then 12 ml
5 M NaCl were added; this mixture was centrifuged 20 min at 15,000
x g. The pellet was resuspended in 5.7 ml TE, and 300 µl of 10% SDS
and 60 µl of 20 mg/ml proteinase K (in sterile distilled water,
Gibco BRL Products, Grand Island, NY) were added to the suspension.
The mixture was incubated at 37°C for 1 hr; then approximately 10 mg
lysozyme (Worthington Biochemical Corp., Freehold, NJ) were added.
After an additional 45 min incubation, 1 ml of 5 M NaCl and 800 µl
of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This
preparation was incubated 10 min at 65°C, then gently agitated and
further incubated and agitated for approximately 20 min to assist
clearing of the cellular material. An equal volume of
chloroform/isoamyl alcohol solution (24:1, v:v) was added, mixed
very gently, and the phases separated by centrifugation at 12,000 x
g for 15 min. The upper (aqueous) phase was gently removed with a
wide-bore pipette and extracted twice as above with an equal volume
of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v:v:v;
equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific
Corporation, Kaysville, UT). The DNA precipitated with 0.6 volume
of isopropanol was gently removed on a glass rod, washed twice with
70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl, 10
mM NaCl, 1 mM EDTA, pH 8). This preparation contained 2.5 mg/ml
DNA, as determined by optical density at 260 nm.
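
As an illustration of how a DNA concentration can be estimated from
optical density at 260 nm, the sketch below uses the common
approximation that an absorbance of 1.0 in a 1 cm path corresponds
to roughly 50 µg/ml of double-stranded DNA. The absorbance reading
and dilution factor are hypothetical and were chosen only so that
the result matches the 2.5 mg/ml stated above; the text does not
report them.

```python
def dsdna_conc_ug_per_ml(a260, dilution_factor, path_cm=1.0,
                         ug_per_ml_per_a260=50.0):
    """Estimate double-stranded DNA concentration (µg/ml) of the undiluted
    sample from an A260 reading taken on a diluted aliquot."""
    return a260 / path_cm * ug_per_ml_per_a260 * dilution_factor


# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
stock_ug_per_ml = dsdna_conc_ug_per_ml(0.5, 100)
stock_mg_per_ml = stock_ug_per_ml / 1000.0
```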

Identification of Bgl II/Hind III Fragments Hybridizing to tc-gene
Specific Probes

Approximately 10 µg of genomic DNA was digested to completion
with about 30 units each of Bgl II and Hind III (NEB) for 180 min,
frozen overnight, then heated at 65°C for five min, and
electrophoresed in a 0.8% agarose gel (Seakem® LE, 1X TEA, 80
volts, 90 min). The DNA was stained with ethidium bromide (50
µg/ml) as described earlier, and photographed under ultraviolet
light. The DNA fragments in the agarose gel were subjected to
depurination (5 min in 0.2 M HCl), denaturation (15 min in 0.5 M
NaOH, 1.5 M NaCl), and neutralization (15 min in 0.5 M Tris-HCl pH
8.0, 1.5 M NaCl), with 3 rinses of distilled water between each
step. The DNA was transferred by Southern blotting from the gel
onto a NYTRAN nylon membrane (Amersham, Arlington Heights, IL)
using a high salt (20X SSC) protocol, as described in section 2.9
of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then
UV-crosslinked to the nylon membrane using a Stratagene UV
Stratalinker set on auto crosslink. The membranes were stored dry
at 25°C until use.

Hybridization was performed using the ECL™ direct (Amersham,
Arlington Heights, IL) labeling and detection system following
protocols provided by the manufacturer. In brief, probes were
prepared by covalently linking the denatured DNA to the enzyme
horseradish peroxidase. Once labeled, the probe was used under
hybridization conditions which maintain the enzymatic activity.
Unhybridized probe was removed by two gentle washes of 20 minutes
each at 42°C in 0.5xSSC, 0.4% SDS, and 6 M urea. This was followed
by two washes of 5 minutes each at room temperature in 2xSSC. As
directed by the manufacturer, ECL™ reagents were used to detect the
hybridizing DNA bands. There are several factors which influence
the ability to detect gene relatedness between various Photorhabdus
strains and strain W-14. First, high stringency conditions have not
been employed in these hybridizations. It is known in the art that
varying the stringency of hybridization and wash conditions will
influence the pattern and intensity of hybridizing bands. Second,
blot-to-blot variation among Southern blots will influence the
mobility of hybridizing bands and molecular weight estimates.
Therefore, W-14 was included as a standard on all Southern blots.
Gene specific probes derived from the W-14 toxin genes were
used in these hybridizations. The following lists the specific
coordinates within each gene sequence to which the probe
corresponds. A probe specific for tcaBi/Bii: 1174 to 3642 of
Sequence ID #25, a probe specific for tcaC: 3637 to 6005 of
Sequence ID #25, a probe specific for tcbA: 2097 to 4964 of
Sequence ID #11, and a probe specific for tcdA: 1660 to 4191 of
Sequence ID #46. The following tables summarize Southern blot
analyses of Photorhabdus strains. In the event that hybridization
of probes occurred, the hybridized fragment(s) were noted as either
identical or different from the pattern observed for the W-14
strain.
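
For orientation, the probe coordinates listed above can be turned
into approximate probe lengths. The sketch assumes the coordinates
are inclusive on both ends, which the text does not state; the
dictionary and function names are illustrative only.

```python
# Probe coordinates as given in the text: (Sequence ID number, start, end).
PROBES = {
    "tcaBi/Bii": (25, 1174, 3642),
    "tcaC": (25, 3637, 6005),
    "tcbA": (11, 2097, 4964),
    "tcdA": (46, 1660, 4191),
}


def probe_length(start, end):
    """Length in bases, assuming both coordinates are inclusive."""
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two probes on the same sequence."""
    seq_a, start_a, end_a = a
    seq_b, start_b, end_b = b
    if seq_a != seq_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


lengths = {name: probe_length(s, e) for name, (_, s, e) in PROBES.items()}
shared = overlap(PROBES["tcaBi/Bii"], PROBES["tcaC"])
```

Under that assumption each probe is roughly 2.4 to 2.9 kb, and the
two probes drawn from Sequence ID #25 share only a few positions at
their junction.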






WO 98/08932 



PCT/US97/07657 



Table 37 

Southern Analysis of Photorhabdus Strains



strains 


tcdA 


tcbA


tcaC




wx- 1 


D 


D 


D 


JJ 


WX-2 


D 






U 


WX-3 


D 


u 


U 


u 


WX-4 


D 


V 


Isiu 


D 


WX - b 




D 


D 


U 


WX - b 


D 


i) 


U 


JJ 


WX- 7 


D 


D 


NU 


b 


wx-« 


U 


u 


U 


D 


wx-y 


ND 


jj 


u 


D 


WX-IU 


ND 


JJ 


u 


D 


WX-11 


ND 


D 


u 


D 


WX-12 


D 


JJ 


JJ 


D 


WX-14 


D 


U 


JJ 


D 


WX-15 


D 


u 


U 


rr 


HPBb 


D 




U 


D 


Hm 


D 




u 


D 11 ' 




u 




D 




H9 


U 




I 


D 




D 




U 




NC-1 " 


D 




D 


D 


W1K 


D 




D 


U 


WiO 


U 


D 


U 


D 


W-14 


I 


1 


1 


I 



ND = Not determined; - = no detectable hybridization product;
I = Identical fragment pattern; D = Different fragment pattern.
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Table., 3 fi 

Southern Analysis of Photorhabdus Strains



strains 




tcbA


tcaC 




VL-L22 


3. 3,2. a 


D 


_ 




PWH-b 


+ 


D 


u 


- 


maicus 


u 


D 


J. 1) 


1 


Megiais 


u 


U 


u 


_ 


GU 


u 


D 


1J 


_ 


HF-Bb 


u 


D 


U 




MP J 


D 


_ 


D 


1 


MP 1 


u 




u 


- 


A. UOWS 


D 


+ 


D 




HB-Arg 


D 


WD 


D 






L) 


u 


D 




HB Lewiston 


U 


D 


D 




hb uswego 


D 


D 


u 




W-14 


I 


I 


1 


I 



I = Identical fragment pattern; D = Different fragment pattern.
+ = Hybridization fragment pattern not determined.
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Table .39 

Southern Analysis of Photorhabdus Strains



scrams 



tcdA 



CCD A 



UL101 



ccac 



trcaB i /B ii 



TP 
TT 



CE2T7 



D 



TT 



"MF5" 



p nepiaius 
f zeaianaia 



TT 



DEP1 



11.0 



TSEFT 



W-14 



J 7572 .U 



2 . b 



2TB 



ND = Not determined; - = no detectable hybridization product;
I = Identical fragment pattern; D = Different fragment pattern.
+ = Hybridization fragment pattern not determined.

From these analyses it is apparent that homologs of W-14 genes
are dispersed throughout these diverse Photorhabdus strains, as
evidenced by differences in gene fragment sizes between W-14 and
the other strains.

Example 25

N-Terminal Amino Acid Sequences of Toxin Complex Peptides from
Different Photorhabdus Strains

To examine the relationship of peptides from different
Photorhabdus strains, peptides isolated as described in Example 14
were subjected to N-terminal amino acid sequencing. The N-terminal
amino acid sequences of toxin peptides in several strains were
compared to W-14 toxin peptides. In Table 40, a comparison of toxin
peptides compared to date showed that identical or homologous (at
least 40% similarity to W14 gene/peptides) toxin peptides were
present in all of the strains. For example, the N-terminal amino
acid sequence of TcaC, SEQ ID NO: 2, was found to be identical to
that for the 160 kDa peptide in HP88, and homologs were also present
in strains WIR, H9, Hb, WX-1, and Hm. Some W-14 peptides or homologs
have not been observed in other strains; however, not all peptides
have been sequenced for toxin complexes from other strains due to
N-terminal blockage or low abundance. In addition, many other
N-terminal amino acid sequences (SEQ ID NOS: 82 to 88) have been
obtained for toxin complex peptides from other strains that have no
similarity to peptides from W-14 and in some cases were identical
to each other. For example, an identical amino acid sequence, SEQ
ID NO: 82, was obtained for a 64 kDa peptide present in both HP88
and Hb strains and a homologous sequence for a 70 kDa peptide in
NC-1 strain (SEQ ID NO: 83).
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Table 40 

A Comparison of Amino Terminal Sequence Homology Between Proteins
Isolated From Non-W-14 Strains





— vrrm 




st,u ID 




strain 


identical 




Homology ] 


Peptide 


Gene 


SEQ ID 


NO: 












TcaAii 


tea A 


15 














TcaAiii 


tea A 


4 














TcaBi 


ZCatS 


o 


"7 C 










f h Kua 








76 




Hm 


_ 




71 kDa 


TcaBii 


tcaB 


5 






H9 


61 kDa 




- 












Hm 


61 kDa 




_ 


TcaC 


tea A 


2 


72 




Hb 


- 




160 kDa 












HP88 


160 kDa 




tt 








73 




WIR 






170 kDa 








74 




H9 


- 




180 kDa 








75 




Hm 


- 




170 kDa 








80 




WX-1 


- 




170 kDa 


TcbAii 


tcbA 


1 














TcbAiii 


tcbA 


40 














TccA 


tccA 


8 


77 




Hb 


- 




81 kDa 


TccB 


tccB 


7 






WX-1 


170 kDa 
















WX-2 


180 kDa 
















WX-14 


180 kDa 
















WIR 


170 kDa 












78 




H9 






170 kDa 












NC-1 


140 kDa 












79 




Hm 






190 kDa 


TcdAii 


tcdA 
















TcdAiii 


tcdA 


41 






Hb 


57 kDa 












81 




H9 






69 kDa | 






9 






Hb 


86 kDa 
















HP88 


8 6 kDa 






homology 


reters to ammo 


acid sequences tn 


at were at 


1 


easz k U 6 


similarity to W14 


gene / 


peptides . 


Similar residues 


were 


identified as being a member in 


one of the 


following 


five 


groups : 


(Pf A, G, 


S r T) ; 


(Q, N , 


E , 


B, D, Z) ; (H, K, 


R) 


; ( 1 * if 


V, M ) ; and (F, Y , 


W) . 
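The similarity rule in the Table 40 footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two peptides are taken to be already aligned, ungapped, and of equal length, a residue outside the five groups counts as similar only when identical, and the comparison peptide is invented for the example. The function names are likewise hypothetical and are not part of the original analysis.

```python
# Conservative-substitution groups as listed in the Table 40 footnote.
SIMILARITY_GROUPS = [
    frozenset("PAGST"),
    frozenset("QNEBDZ"),
    frozenset("HKR"),
    frozenset("LIVM"),
    frozenset("FYW"),
]


def residues_similar(a: str, b: str) -> bool:
    """Return True if two one-letter residues are identical or share a group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILARITY_GROUPS)


def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical or similar.

    Assumption of this sketch: the sequences are pre-aligned, ungapped,
    and of equal length.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(residues_similar(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * hits / len(seq_a)


# SEQ ID NO:1 (TcbAii N-terminus) written in one-letter code.
W14_TCBAII = "FIQGYSDLFGN"
# Hypothetical comparison peptide, invented for illustration only.
HYPOTHETICAL = "FLNGYTDLFAK"

print(round(percent_similarity(W14_TCBAII, HYPOTHETICAL), 1))
```

In this made-up comparison ten of the eleven positions are identical or fall in the same group; only the final Asn/Lys pair spans two different groups.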















Example 20

Immunological Analysis of Photorhabdus Strains

Culture broths of Photorhabdus strains were concentrated 10 to
15 times using a Centriprep-10 ultrafiltration device (Amicon, Inc.,
Beverly, MA 01915). The protein concentration ranged from 0.3 to
3.0 mg per ml. Ten to 20 µg of total protein was loaded in each
well of a precast 4-20% polyacrylamide gel (Integrated Separation
Systems, Natick, MA 01760). Gel electrophoresis was performed for
1.25 hours using a constant current set at 25 mA per gel. The gel
was electro-blotted onto Hybond-ECL™ nitrocellulose membrane
(Amersham Corporation, Arlington Hts, IL 60005) using a semi-dry
electro-blotter (Pharmacia Biotech Inc., Piscataway, NJ 08854). A
constant current was applied at 0.75 mA per cm² for 2.5 hours. The
membrane was blocked with 10% milk in TBST (25 mM Tris HCl pH 7.4,
136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room
temperature. Each primary antibody was diluted in 10% milk/TBST to
1:500. Other dilutions between 1:50 and 1:1000 were also used. The
membrane was incubated in primary antibody for at least one hour.
It was then washed thoroughly with the blocking solution or TBST. A
1:2000 dilution of secondary antibodies (goat anti-mouse IgG or
goat anti-rabbit IgG conjugated to horseradish peroxidase; BioRad
Laboratories, Hercules, CA 94547) in 10% milk/TBST was applied to
the membrane, which was placed on a platform rocker for one hour.
The membrane was subsequently washed with an excess of TBST. The
protein was detected using an ECL (Enhanced Chemiluminescence)
detection kit (Amersham International).
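The loading and dilution arithmetic implied by this protocol can be sketched as follows. The helper names and the 10 ml working volume are assumptions made for illustration; only the concentration range, the 10 to 20 µg load, and the 1:500 and 1:2000 dilutions come from the text.

```python
def load_volume_ul(conc_mg_per_ml: float, target_ug: float) -> float:
    """Volume of concentrated broth (in µl) that contains target_ug of protein.

    1 mg/ml is the same as 1 µg/µl, so the volume is simply mass / concentration.
    """
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return target_ug / conc_mg_per_ml


def antibody_stock_ul(final_volume_ml: float, dilution: int) -> float:
    """Volume of antibody stock (in µl) for a 1:dilution working solution."""
    if dilution <= 0:
        raise ValueError("dilution factor must be positive")
    return final_volume_ml * 1000.0 / dilution


# Extremes of the stated ranges: 0.3 to 3.0 mg/ml broth, 10 to 20 µg per well.
print(round(load_volume_ul(0.3, 10.0), 1))   # most dilute broth, smallest load
print(round(load_volume_ul(3.0, 20.0), 1))   # most concentrated broth, largest load

# Hypothetical 10 ml of 10% milk/TBST per membrane (volume is an assumption).
print(antibody_stock_ul(10.0, 500))    # primary antibody at 1:500
print(antibody_stock_ul(10.0, 2000))   # secondary antibody at 1:2000
```

At the dilute end of the range a 10 µg load already needs roughly 33 µl of broth, so well capacity of the precast gel may limit how much of the more dilute samples can be loaded.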

A panel of peptide-specific antibodies generated against W-14
peptides was used to characterize the protein composition of
broths from nine non-W-14 Photorhabdus strains using Western blot
analysis. In addition, one monoclonal antibody (MAb-C5F2), which
recognizes the TcbAiii protein in W-14-derived toxin complex, was
used. The results (Table 39) showed cross recognition of the
antibodies to some of the proteins in these broths. In some cases,
the proteins recognized by the antibodies were the same size as the
W-14 target peptides. In other cases, the proteins recognized by
the antibodies were smaller than the W-14 target peptides. These
data indicate that some of the non-W-14 Photorhabdus strains may
produce proteins similar to those of the W-14 strain. The size
difference could be due to deletion, protein processing, or
degradation. Some of the strains did not contain protein(s) that
could be recognized by some antibodies; however, it is possible
that the concentration of such proteins is significantly lower than
that observed for W-14 peptides. When compared across the various
toxin peptide homologs, these results showed peptide diversity
among the Photorhabdus strains.
Table 41

Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies
Generated Against W-14 Peptides to Protein(s) in Broths of Selected
Non-W-14 Photorhabdus



PtlO CO - 


l v LrtJJ 










rruJ 






JrMJj 


X i i cl JL/L2 LI 






x w \JLr\ 




i- v— an 




1 L an 


Tra 21 








i i — 


i i i - 


— syn 


i i - 


i i i — 




■I l _ 

X X 


"5 "i i — 

XXX 






svn 


svn 




svn 


syn 


syn 


syn 


syn 


"MPT 




+ 


+ 


+ 




■f 








"WFT" " 


+ 




+ 






-f. 


4. 


+ 




MPi 




+ 


+ 


+ 




NT 


+ 


+ 




A. COWS 






+ 


+ 




NT 


+ 


+ 




HJD-OSW 






NT 


+ 




NT 


+ 


+ 


+ 


H-Arg 




+ 








"NT 


+ 




+ 


Hb-leu 






-*- 


+ 




"NT 


+ 


+ 


+ 


maicus 


+ 


+ 




+ 


+ 


NT 


+ 


+ 








+ 








+ 


+ 






W-14 




+ 


+ 


+ 




+ 


+ 




■+ 



+ = Positive reaction; - = Negative reaction;



Additional non-W-14 Photorhabdus strains were characterized by
Western blot analysis using the culture broth and/or partially
purified protein fractions as antigen. The panel of antibodies
included MAb-C5F2, MAb-DE1 (recognizing TcdAii), PAb-DE2
(recognizing TcaB), PAb-TcbAii-syn, PAb-TcaC-syn, PAb-TcaBii-syn,
PAb-TcbAiii-syn, and PAb-TcaBi-syn. These antibodies showed
cross-reactivity with proteins in the broth and in the partially
purified fractions of non-W-14 strains.

The data indicate that antibodies could be used to identify
proteins in the broth as well as in the partially purified protein
fractions.
Table 42

Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies
Generated Against W-14 Peptides to Protein(s) in Broths and/or
Partial Purified Protein Fractions of Selected Non-W-14 Photorhabdus

               Monoclonal    Polyclonal Antibodies
               Antibodies
Photorhabdus   MAb-   MAb-   PAb-   PAb-      PAb-    PAb-      PAb-       PAb-
Strain         C5F2   DE1    DE2    TcbAii-   TcaC-   TcaBii-   TcbAiii-   TcaBi-
                                    syn       syn     syn       syn        syn

WX-1           +      +      +      +         +       +         +          +
WX-2           +      +             +         +       +         NT         +
WX-3           +      NT            NT        NT      NT        NT         NT
WX-5           +      NT     +      NT        NT      NT        NT         NT
WX-6           +      NT     NT     NT        NT      NT        NT         NT
WX-7                  +      +      +         +       +         NT         +
WX-8           +      NT     NT     NT        NT      NT        NT         NT
WX-9           +      NT     NT     NT        NT      NT        NT         NT
WX-10                 NT     NT     NT        NT      NT        NT         NT
WX-12                 +      +      +         +       +         +          +
WX-14                 +             +         NT                NT         +
WX-15          +      NT     NT     NT        NT      NT        NT         NT
W30            +      +      +      NT        NT      NT        NT         NT
Hb                    NT     +      NT        +       NT                   +
H9                                  NT        +       +         NT         NT
Hm                    NT            +         +                 NT         + +
HP88                  NT     +                +
NC-1           +             +                +       +         NT         +
WIR                   NT     +      +         +       +         +          +
W-14           +      +      +      +                 +                    +



Example 21

Bacterial Expression of the tcdA Coding Region



Engineering of the tcdA Gene for Bacterial Expression 

The 5' and 3' ends of the tcdA coding region (SEQ ID NO:46)
were modified to add useful cloning sites for inserting the segment
into heterologous expression vectors. The ends were modified using
unique primers in Polymerase Chain Reactions (PCR), performed
essentially as described in Example 8. Primer sets, as described
below, were used in conjunction with cosmid 21D2.4 as template to
create products with the appropriately modified ends.

The first primer set was used to modify the 5' end of the gene,
to insert a unique Nco I site at the initiator codon using the
forward primer A0F1 (5' GAT CGA TCG ATC CAT GGC CAA CGA GTC TGT AAA
AGA GAT ACC TGA TG TAT TAA AAA GCC AGT GTG 3') and to add unique Bgl
II, Sal I and Not I sites to facilitate insertion of the remainder
of the gene using the reverse primer A0R1 (5' GAT CGA TCG TAC GCG
GCC GCT CG[...]G ATC GTC GAC CCA TTG ATT T[...] GAT CTG GGC GGC GGG
TAT CCA GAT AAT AAA CGG AGT CAC 3').

Another PCR reaction was designed to modify the 3' end of the
gene by adding an additional stop codon and convenient restriction
sites for cloning. The forward primer A0F2 (5' ACT GGC TGC GTG GTC
GAC TGG CGG CGA TTT ACT 3') was used to amplify across a unique Sal
I site in the gene, later used to clone the modified 3' end. The
reverse primer A0R2 (5' CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG
TCA TTA TTT AAT GGT GTA GCG AAT ATG CAA AAT 3') was used to insert a
second stop codon (TGA) and cloning sites Xho I, Stu I and Not I.
Bacterial expression vector pET27b (Novagen, Madison, WI) was
modified to delete the Bgl II site at position 446, according to
standard molecular biology techniques.
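As a quick check on primer design of this kind, the engineered sites can be located with a simple string search. The sketch below is an illustration only: it uses the A0F2 and A0R2 sequences as printed above, the standard recognition sequences of the named enzymes, and hypothetical helper names. All six recognition sequences are palindromic, so searching one strand is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "NcoI": "CCATGG",
    "BglII": "AGATCT",
    "SalI": "GTCGAC",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
    "StuI": "AGGCCT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Strip whitespace and upper-case a primer written in triplets."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based position of its first site, if present."""
    seq = clean(seq)
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


A0F2 = "ACT GGC TGC GTG GTC GAC TGG CGG CGA TTT ACT"
A0R2 = ("CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG "
        "TCA TTA TTT AAT GGT GTA GCG AAT ATG CAA AAT")

print(sorted(find_sites(A0F2)))
print(sorted(find_sites(A0R2)))
# On the coding strand the reverse primer reads ...TAA TGA followed by Xho I.
print("TAATGACTCGAG" in reverse_complement(A0R2))
```

Reading the reverse primer as its reverse complement shows the native TAA stop followed directly by the added TGA stop and then the Xho I site, which matches the description of A0R2 in the text.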



The 497 bp PCR product from the first amplification reaction
(A0F1+A0R1), to modify the 5' end of the gene, was ligated to the
modified pET27b vector according to the supplier's instructions.
The DNA sequences of the amplified portion of three isolates were
determined using the supplier's recommended primers and the
sequencing methods described previously. The sequence of all
isolates was the same.

One isolate was then used as a cloning vector to insert the
middle portion of the tcdA gene on a 6341 bp Bgl II to Sal I
fragment. The resulting clone was called MC4 and contained all but
the 3' most portion of the tcdA coding sequence. Finally, to
complete the full-length coding region, the 832 bp PCR product from
the second PCR amplification (A0F2+A0R2), to modify the 3' end of
the gene, was ligated to isolate MC4 on a Sal I to Not I fragment,
according to standard molecular biology techniques. The tcdA coding
region was sequenced and found to be complete; the resulting plasmid
is called pDAB2035.

Construction of Plasmids pDAB2036, pDAB2037 and pDAB2038 for
Bacterial Expression of tcdA

The tcdA coding region was cut from plasmid pDAB2035 with
restriction enzymes Nco I and Xho I and gel purified. The fragment
was ligated into the Nco I and Xho I sites of the expression vector
pET15 to create plasmid pDAB2036. Additionally, pDAB2035 was cut
with Nco I and Not I to release the tcdA coding region, which was
ligated into the Nco I and Not I sites of the expression vector
pET28b to create plasmid pDAB2037. Finally, plasmid pDAB2035 was
cut with Nco I and Stu I to release the tcdA coding region. This
fragment was ligated into the expression vector Trc99a, which was cut
with Hind III followed by treatment with T4 DNA polymerase to blunt
the ends. The vector was then cut with Nco I and ligated with the
Nco I/Stu I cut tcdA fragment. The resulting plasmid is called
pDAB2038.

Expression of tcdA from Plasmid pDAB2038

Plasmid pDAB2038 was transformed into BL21 cells and expressed 
as described above for plasmid pDAB2033 in Example 19. 

Purification of tcdA from E. coli

The expression culture was centrifuged at 10,300 g for 30 min
and the supernatant was collected. It was diluted with two volumes
of H2O and applied at a flow rate of 7.5 ml/min to a Poros 50 HQ
(PerSeptive Biosystems, MA) column (1.6 cm x 10 cm) which was
pre-equilibrated with 10 mM sodium phosphate buffer, pH 7.0
(Buffer A). The column was washed with Buffer A until the optical
density at 280 nm returned to baseline level. The proteins bound to
the column were then eluted with 1 M NaCl in Buffer A.

The fraction was loaded in 20 ml aliquots onto a gel filtration
column, Sepharose CL-4B (2.6 x 100 cm), which was equilibrated with
Buffer A. The protein was eluted in Buffer A at a flow rate of 0.75
mL/min. Fractions with a retention time between 260 minutes and 460
minutes were pooled and applied at 1 mL/min to a Mono Q 5/5 column
which was equilibrated with 20 mM Tris-HCl, pH 7.0 (Buffer B). The
column was washed with Buffer B until the optical density at 280 nm
returned to baseline level. The proteins bound to the column were
eluted with a linear gradient of 0 to 1 M NaCl in Buffer B at
1 mL/min for 30 min. One milliliter fractions were collected,
serially diluted, and subjected to SCR bioassay. Fractions eluted
between 0.1 and 0.3 M NaCl were found to have the highest
insecticidal activity. Western analysis of the active fractions
using pAb TcdAii-syn antibody and pAb TcdAiii-syn antibody indicated
the presence of peptides TcdAii and TcdAiii.
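The relationship between fraction number and salt concentration in the Mono Q step can be sketched as below. This is an idealized calculation, not part of the original procedure: it assumes a perfectly linear gradient with no column or tubing delay volume, fraction 1 starting when the gradient starts, and concentration evaluated at the midpoint of each 1 mL fraction. The function names are hypothetical.

```python
def nacl_molar_at(volume_ml: float,
                  flow_ml_per_min: float = 1.0,
                  gradient_min: float = 30.0,
                  start_m: float = 0.0,
                  end_m: float = 1.0) -> float:
    """NaCl concentration (M) after volume_ml of an ideal linear gradient."""
    total_ml = flow_ml_per_min * gradient_min
    fraction_done = min(max(volume_ml / total_ml, 0.0), 1.0)
    return start_m + (end_m - start_m) * fraction_done


def fractions_in_window(low_m: float, high_m: float,
                        fraction_ml: float = 1.0,
                        n_fractions: int = 30) -> list:
    """1-based fraction numbers whose midpoint concentration lies in the window."""
    hits = []
    for n in range(1, n_fractions + 1):
        midpoint_ml = (n - 0.5) * fraction_ml
        if low_m <= nacl_molar_at(midpoint_ml) <= high_m:
            hits.append(n)
    return hits


# Fractions expected to cover the 0.1 to 0.3 M NaCl window named in the text.
print(fractions_in_window(0.1, 0.3))
```

Under these assumptions the most active material would be expected in roughly fractions 4 through 9; on a real system the delay volume shifts this window to later fractions.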
(1) GENERAL INFORMATION:

     (i) APPLICANT: Ensign, Jerald C
                    Bowen, David J
                    Petell, James
                    Fatig, Raymond
                    Schoonover, Sue
                    ffrench-Constant, Richard
                    Orr, Gregory L
                    Merlo, Donald J
                    Roberts, Jean L
                    Rocheleau, Thomas A

    (ii) TITLE OF INVENTION: Insecticidal Protein Toxins from
                             Photorhabdus

   (iii) NUMBER OF SEQUENCES: 88

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: DowElanco
          (B) STREET: 9330 Zionsville Road
          (C) CITY: Indianapolis
          (D) STATE: IN
          (E) COUNTRY: US
          (F) ZIP: 46268

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/063,615
          (B) FILING DATE: 18-MAY-1993

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/395,497
          (B) FILING DATE: 28-FEB-1995

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/007,255
          (B) FILING DATE: 06-NOV-1995

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/608,423
          (B) FILING DATE: 28-FEB-1996

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/705,484
          (B) FILING DATE: 28-AUG-1996

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/743,699
          (B) FILING DATE: 06-NOV-1996

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Borucki, Andrea T.
          (B) REGISTRATION NUMBER: 33,651
          (C) REFERENCE/DOCKET NUMBER: 50301E

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: 317-337-4846
          (B) TELEFAX: 317-337-4847

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1 (TcbAii N-terminus):

Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
1               5                   10

(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2 (TcaC N-terminus):

Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
1               5                   10

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 19 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3 (TcaBi N-terminus):

Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
1               5                   10                  15

Leu Val Ala
(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 14 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4 (TcaAiii N-terminus):

Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
1               5                   10

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 9 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5 (TcaBii N-terminus):

Ala Gly Asp Thr Ala Asn Ile Gly Asp
1               5

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 15 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7 (TccB N-terminus):

Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
1               5                   10
(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 9 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8 (TccA N-terminus):

Met Asn Leu Ala Ser Pro Leu Ile Ser
1               5

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 16 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
     (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu
1               5                   10                  15

Arg Gly Val Asn
            20

(2) INFORMATION FOR SEQ ID NO:11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 7515 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA (genomic)
    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..7515
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11 (tcbA gene):

ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT CAG AAA CTG  48
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1               5                   10                  15

CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TTC  96
Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
            20                  25                  30

CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT  144
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
        35                  40                  45

TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA  192
Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
    50                  55                  60

CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG  240
Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65                  70                  75                  80

GGT ACC CGG CAA ATG TTG GGT TTT ATA CAA GGT TAT AGT GAT CTG TTT  288
Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp Leu Phe
                85                  90                  95

GGT AAT CGT GCT GAT AAC TAT GCC GCG CCG GGC TCG GTT GCA TCG ATG  336
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
            100                 105                 110

TTC TCA CCG GCG GCT TAT TTG ACG GAA TTG TAC CGT GAA GCC AAA AAC  384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn
        115                 120                 125

TTG CAT GAC AGC AGC TCA ATT TAT TAC CTA GAT AAA CGT CGC CCG GAT  432
Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp
    130                 135                 140

TTA GCA AGC TTA ATG CTC AGC CAG AAA AAT ATG GAT GAG GAA ATT TCA  480
Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu Ile Ser
145                 150                 155                 160

ACG CTG GCT CTC TCT AAT GAA TTG TGC CTT GCC GGG ATC GAA ACA AAA  528
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu Thr Lys
                165                 170                 175

ACA GGA AAA TCA CAA GAT GAA GTG ATG GAT ATG TTG TCA ACT TAT CGT  576
Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg
            180                 185                 190

TTA AGT GGA GAG ACA CCT TAT CAT CAC GCT TAT GAA ACT GTT CGT GAA  624
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu
        195                 200                 205

ATC GTT CAT GAA CGT GAT CCA GGA TTT CGT CAT TTG TCA CAG GCA CCC  672
Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
    210                 215                 220

ATT GTT GCT GCT AAG CTC GAT CCT GTG ACT TTG TTG GGT ATT AGC TCC  720
Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225                 230                 235                 240

CAT ATT TCG CCA GAA CTG TAT AAC TTG CTG ATT GAG GAG ATC CCG GAA  768
His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
                245                 250                 255

AAA GAT GAA GCC GCG CTT GAT ACG CTT TAT AAA ACA AAC TTT GGC GAT  816
Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
            260                 265                 270

ATT ACT ACT GCT CAG TTA ATG TCC CCA AGT TAT CTG GCC CGG TAT TAT  864
Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
        275                 280                 285

GGC GTC TCA CCG GAA GAT ATT GCC TAC GTG ACG ACT TCA TTA TCA CAT  912
Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His
    290                 295                 300

GTT GGA TAT AGC AGT GAT ATT CTG GTT ATT CCG TTG GTC GAT GGT GTG  960
Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305                 310                 315                 320

GGT AAG ATG GAA GTA GTT CGT GTT ACC CGA ACA CCA TCG GAT AAT TAT  1008
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
                325                 330                 335

ACC AGT CAG ACG AAT TAT ATT GAG CTG TAT CCA CAG GGT GGC GAC AAT  1056
Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
            340                 345                 350

TAT TTG ATC AAA TAC AAT CTA AGC AAT AGT TTT GGT TTG GAT GAT TTT  1104
Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
        355                 360                 365

TAT CTG CAA TAT AAA GAT GGT TCC GCT GAT TGG ACT GAG ATT GCC CAT  1152
Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
    370                 375                 380

AAT CCC TAT CCT GAT ATG GTC ATA AAT CAA AAG TAT GAA TCA CAG GCG  1200
Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385                 390                 395                 400

ACA ATC AAA CGT AGT GAC TCT GAC AAT ATA CTC AGT ATA GGG TTA CAA  1248
Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
                405                 410                 415

AGA TGG CAT AGC GGT AGT TAT AAT TTT GCC GCC GCC AAT TTT AAA ATT  1296
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
            420                 425                 430

GAC CAA TAC TCC CCG AAA GCT TTC CTG CTT AAA ATG AAT AAG GCT ATT  1344
Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
        435                 440                 445

CGG TTG CTC AAA GCT ACC GGC CTC TCT TTT GCT ACG TTG GAG CGT ATT  1392
Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
    450                 455                 460

GTT GAT AGT GTT AAT AGC ACC AAA TCC ATC ACG GTT GAG GTA TTA AAC  1440
Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn
465                 470                 475                 480

AAG GTT TAT CGG GTA AAA TTC TAT ATT GAT CGT TAT GGC ATC AGT GAA  1488
Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu
                485                 490                 495

GAG ACA GCC GCT ATT TTG GCT AAT ATT AAT ATC TCT CAG CAA GCT GTT  1536
Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val
            500                 505                 510

GGC AAT CAG CTT AGC CAG TTT GAG CAA CTA TTT AAT CAC CCG CCG CTC  1584
Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
        515                 520                 525

AAT GGT ATT CGC TAT GAA ATC AGT GAG GAC AAC TCC AAA CAT CTT CCT  1632
Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
    530                 535                 540

AAT CCT GAT CTG AAC CTT AAA CCA GAC AGT ACC GGT GAT GAT CAA CGC  1680
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545                 550                 555                 560
AAG GCG GTT T[...]A CGC GCG TTT CAG GTT AAC GC[...]T GAG TTG TAT  1728
Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
                565                 570                 575

CAG ATG TTA TTG ATC ACT GAT CGT AAA GAA GAC GGT GTT ATC AAA AAT  1776
Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
            580                 585                 590

AAC TTA GAG AAT TTG TCT GAT CTG TAT TTG GTT AGT TTG CTG GCC CAG  1824
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
        595                 600                 605

ATT CAT AAC CTG ACT ATT GCT GAA TTG AAC ATT TTG TTG GTG ATT TGT  1872
Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys
    610                 615                 620

GGC TAT GGC GAC ACC AAC ATT TAT CAG ATT ACC GAC GAT AAT TTA GCC  1920
Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn Leu Ala
625                 630                 635                 640

AAA ATA GTG GAA ACA TTG TTG TGG ATC ACT CAA TGG TTG AAG ACC CAA  1968
Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln
                645                 650                 655

AAA TGG ACA GTT ACC GAC CTG TTT CTG ATG ACC ACG GCC ACT TAC AGC  2016
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
            660                 665                 670

ACC ACT TTA ACG CCA GAA ATT AGC AAT CTG ACG GCT ACG TTG TCT TCA  2064
Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
        675                 680                 685

ACT TTG CAT GGC AAA GAG AGT CTG ATT GGG GAA GAT CTG AAA AGA GCA  2112
Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
    690                 695                 700

ATG GCG CCT TGC TTC ACT TCG GCT TTG CAT TTG ACT TCT CAA GAA GTT  2160
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705                 710                 715                 720

GCG TAT GAC CTG CTG TTG TGG ATA GAC CAG ATT CAA CCG GCA CAA ATA  2208
Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
                725                 730                 735

ACT GTT GAT GGG TTT TGG GAA GAA GTG CAA ACA ACA CCA ACC AGC TTG  2256
Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
            740                 745                 750

AAG GTG ATT ACC TTT GCT CAG GTG CTG GCA CAA TTG AGC CTG ATC TAT  2304
Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
        755                 760                 765

CGT CGT ATT GGG TTA AGT GAA ACG GAA CTG TCA CTG ATC GTG ACT CAA  2352
Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
    770                 775                 780

TCT TCT CTG CTA GTG GCA GGC AAA AGC ATA CTG GAT CAC GGT CTG TTA  2400
Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785                 790                 795                 800

ACC CTG ATG GCC TTG GAA GGT TTT CAT ACC TGG GTT AAT GGC TTG GGG  2448
Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
                805                 810                 815

CAA CAT GCC TCC TTG ATA TTG GCG GCG TTG AAA GAC GGA GCC TTG ACA  2496
Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
            820                 825                 830

GTT ACC GAT GTA GCA CAA GCT ATG AAT AAG GAG GAA TCT CTC CTA CAA  2544
Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
        835                 840                 845
ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG AC[...] AGT TGG  2592
Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
    850                 855                 860

ACA CAG ATT GAC GCT ATT CTG CAA TGG TTA CAG ATG TCT TCG GCC TTG  2640
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865                 870                 875                 880

GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG  2688
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
                885                 890                 895

ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG  2736
Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
            900                 905                 910

GCT GAT CAT GCT AAT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC AGT  2784
Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
        915                 920                 925

AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT  2832
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
    930                 935                 940

GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT TTG CTG ATT GAT AAT  2880
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945                 950                 955                 960

CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC  2928
Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
                965                 970                 975

GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG  2976
Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
            980                 985                 990

CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT  3024
Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
        995                 1000                1005

TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT GTC TCT GAA CTG GTC TAT  3072
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr
    1010                1015                1020

TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA  3120
Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln Thr Lys
1025                1030                1035                1040

ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG  3168
Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
                1045                1050                1055

GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG  3216
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
            1060                1065                1070

GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG  3264
Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
        1075                1080                1085

GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT  3312
Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
    1090                1095                1100

ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG  3360
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105                1110                1115                1120

TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC  3408
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
                1125                1130                1135

AAT CCT TGG AAA AAT ATC ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA  3456
Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu
            1140                1145                1150

TAT CTG CTA TGG CTG GAG CAG CAA TCA AAG AAA AGT GAT GAT GGT AAA  3504
Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys
        1155                1160                1165

ACC ACG ATT TAT CAA TAT AAC TTA AAA CTG GCT CAT ATT CGT TAC GAC  3552
Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
    1170                1175                1180

GGT AGT TGG AAT ACA CCA TTT ACT TTT GAT GTG ACA GAA AAG GTA AAA  3600
Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys
1185                1190                1195                1200

AAT TAC ACG TCG AGT ACT GAT GCT GCT GAA TCT TTA GGG TTG TAT TGT  3648
Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
                1205                1210                1215

ACT GGT TAT CAA GGG GAA GAC ACT CTA TTA GTT ATG TTC TAT TCG ATG  3696
Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met
            1220                1225                1230

CAG AGT AGT TAT AGC TCC TAT ACC GAT AAT AAT GCG CCG GTC ACT GGG  3744
Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
        1235                1240                1245

CTA TAT ATT TTC GCT GAT ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA  3792
Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln
    1250                1255                1260

GCA ACT AAC TAT TGG AAT AAC AGT TAT CCG CAA TTT GAT ACT GTG ATG  3840
Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met
1265                1270                1275                1280

GCA GAT CCG GAT AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT AAT  3888
Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn
                1285                1290                1295

AAC CGT TAT GCG GAG GAT TAT GAA ATT CCT TCC TCT GTG ACA AGT AAC  3936
Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn
            1300                1305                1310

AGT AAT TAT TCT TGG GGT GAT CAC AGT TTA ACC ATG CTT TAT GGT GGT  3984
Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly
        1315                1320                1325

AGT GTT CCT AAT ATT ACT TTT GAA TCG GCG GCA GAA GAT TTA AGG CTA  4032
Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu
    1330                1335                1340

TCT ACC AAT ATG GCA TTG AGT ATT ATT CAT AAT GGA TAT GCG GGA ACC  4080
Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr
1345                1350                1355                1360

CGC CGT ATA CAA TGT AAT CTT ATG AAA CAA TAC GCT TCA TTA GGT GAT  4128
Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp
                1365                1370                1375

AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC CGT TTT AAT  4176
Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn
            1380                1385                1390

CTG GTG CCA TTG TTT AAA TTC GGA AAA GAC GAG AAC TCA GAT GAT AGT  4224
Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser
        1395                1400                1405

ATT TGT ATA TAT AAT GAA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT  4272
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G^J^sn Pro Ser Ser Glu Asp Lys hy^^^p 



He Cys He Tyr Asn Gl^&sn Pro Ser Ser Glu Asp Lys Ly^^pp Tyr 

1410 1415 1420 

TTT TCT TCG AAA GAT GAC AAT AAA ACA GCG GAT TAT AAT GGT GGA ACT 4320
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440

CAA TGT ATA GAT GCT GGA ACC AGT AAC AAA GAT TTT TAT TAT AAT CTC 4368
Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu
1445 1450 1455

CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 4416
Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr
1460 1465 1470

AAA ATA TCC AAC CCG ATT AAT ATC AAT ACG GGC ATT GAT AGT GCT AAA 4464
Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485

GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 4512
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala
1490 1495 1500

GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 4560
Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu
1505 1510 1515 1520

ATG ATT TAT CAG TTC AAT AAC CTG ACA ATA GAT TGT AAG AAT TTA AAT 4608
Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn
1525 1530 1535

TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 4656
Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala
1540 1545 1550

CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 4704
Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr
1555 1560 1565

AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752
Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn
1570 1575 1580

AAC GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4800
Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn
1585 1590 1595 1600

ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4848
Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp
1605 1610 1615

GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4896
Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630

GCG GGC ACA TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4944
Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile
1635 1640 1645

CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 4992
His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys
1650 1655 1660

GAG AAC GAT AGT TTT GTG ATT TAT CAA GGA GAA CTT AGC GAA ACA AGT 5040
Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser
1665 1670 1675 1680

CAA ACT GTT GTG AAA GTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5088
Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly
1685 1690 1695

AAT AAG AAC [codon illegible] TTA TGG GTA CGT GCT AAA TAC [two codons illegible] GAA ACG ACT 5136
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr
1700 1705 1710

GAT AAG ATC TTG TTC GAC CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5184
Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725

TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln
1730 1735 1740

GCA TTA AAG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5280
Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760

CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5328
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His
1765 1770 1775






CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 5376
Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg
1780 1785 1790

TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424
Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile
1795 1800 1805

TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5472
Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala
1810 1815 1820

CAA CAA CTG GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520
Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro
1825 1830 1835 1840

ATG CAC TAC AAG GTG GCT ACC TTT ATG GCG ACG TTG GAT CTG CTA ATG 5568
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855



GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT 5616
Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala
1860 1865 1870

GAA GCT AAA ATG TGG TAT ACA CAG GCG CTT AAT CTG TTG GGT GAT GAG 5664
Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu
1875 1880 1885

CCA CAA GTG ATG CTG AGT ACG ACT TGG GCT AAT CCA ACA TTG GGT AAT 5712
Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn
1890 1895 1900

GCT GCT TCA AAA ACC ACA CAG CAG GTT CGT CAG CAA GTG CTT ACC CAG 5760
Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln
1905 1910 1915 1920



TTG CGT CTC AAT AGC AGG GTA AAA ACC CCG [...]
Leu Arg Leu Asn Ser Arg Val Lys Thr Pro [...]
1925 1930 1935

TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA [...]
Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu [...]
1940 1945 1950

TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT [...]
Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe [...]
1955 1960 1965

TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG [...]
Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro [...]
1970 1975 1980




GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 6000
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly
1985 1990 1995 2000

GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 6048
Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met
2005 2010 2015

CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT ATA CAG TTC GGT AGT 6096
Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser
2020 2025 2030

TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG GAA GCT ATG AGT CAA 6144
Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln
2035 2040 2045

CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6192
Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met
2050 2055 2060

CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240
Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln
2065 2070 2075 2080



GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 6288
Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu
2085 2090 2095

TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA GCG CTG GCG TTA CGC 6336
Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg
2100 2105 2110

TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG GCA 6384
Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala
2115 2120 2125

GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432
Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly
2130 2135 2140

GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480
Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu
2145 2150 2155 2160

TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6528
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser
2165 2170 2175

GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6576
Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn
2180 2185 2190

GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA CTG GAA TCA CTG TCT 6624
Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205

ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AAA ACC CAG 6672
Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln
2210 2215 2220

CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA AGA AGC AAA TTC AGT 6720
Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser
2225 2230 2235 2240



AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6768
Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr
2245 2250 2255

TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 6816
Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln
2260 2265 2270

TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6864
Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro
2275 2280 2285

GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912
Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu
2290 2295 2300

ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960
Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser
2305 2310 2315 2320



CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT 7008
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp
2325 2330 2335

TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ATA CCT GCA 7056
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala
2340 2345 2350

TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT AAA GAA AAT GGG TTA 7104
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu
2355 2360 2365

TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152
Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
2370 2375 2380

AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200
Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385 2390 2395 2400

CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 7248
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
2405 2410 2415

TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG 7296
Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
2420 2425 2430

CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT 7344
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445

GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAC CTG CCA TTT GAA 7392
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460

GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCC AAT 7440
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480

GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC GAT ATT ATT 7488
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495

TTG CAT ATT CGT TAT ACC ATC CGT TAA 7515
Leu His Ile Arg Tyr Thr Ile Arg *
2500 2505

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2504 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 (TcbA protein):

Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15

Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
20 25 30

Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
35 40 45

Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
50 55 60

Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65 70 75 80

Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp Leu Phe
85 90 95

Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
100 105 110

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn
115 120 125

Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp
130 135 140

Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu Ile Ser
145 150 155 160

Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu Thr Lys
165 170 175

Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg
180 185 190

Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu
195 200 205

Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
210 215 220

Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225 230 235 240

His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
245 250 255

Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
260 265 270

Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285

Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His
290 295 300

Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305 310 315 320

Gly Lys Met Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335

Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
340 345 350

Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365

Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
370 375 380

Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385 390 395 400

Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
405 410 415

Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
420 425 430

Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
435 440 445

Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460

Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn
465 470 475 480

Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu
485 490 495

Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val
500 505 510

Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
515 520 525

Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
530 535 540

Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545 550 555 560

Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575

Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590

Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
595 600 605

Ile His Asn Leu Thr Ile Ala Glu Leu Asn Ile Leu Leu Val Ile Cys
610 615 620

Gly Tyr Gly Asp Thr Asn Ile Tyr Gln Ile Thr Asp Asp Asn Leu Ala
625 630 635 640

Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln
645 650 655

Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser
660 665 670

Thr Thr Leu Thr Pro Glu Ile Ser Asn Leu Thr Ala Thr Leu Ser Ser
675 680 685

Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700

Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705 710 715 720

Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735

Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750

Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765



Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780

Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785 790 795 800

Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815

Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830

Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845

Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860

Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880

Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895



Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
900 905 910

Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
915 920 925

Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940

Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945 950 955 960

Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala
965 970 975

Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln
980 985 990

Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg
995 1000 1005



Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr
1010 1015 1020

Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gln Arg Ile Gly Gln Thr Lys
1025 1030 1035 1040

Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
1045 1050 1055

Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
1060 1065 1070

Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
1075 1080 1085

Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
1090 1095 1100

Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105 1110 1115 1120

Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
1125 1130 1135

Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu
1140 1145 1150

Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys
1155 1160 1165

Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
1170 1175 1180

Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys
1185 1190 1195 1200

Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
1205 1210 1215

Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met
1220 1225 1230

Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
1235 1240 1245

Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln
1250 1255 1260

Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met
1265 1270 1275 1280

Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn
1285 1290 1295

Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn
1300 1305 1310

Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly
1315 1320 1325

Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu
1330 1335 1340

Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr
1345 1350 1355 1360

Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp
1365 1370 1375

Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn
1380 1385 1390

Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser
1395 1400 1405

Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr
1410 1415 1420

Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440

Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu
1445 1450 1455

Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr
1460 1465 1470

Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485

Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala
1490 1495 1500

Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu
1505 1510 1515 1520

Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn
1525 1530 1535

Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala
1540 1545 1550

Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr
1555 1560 1565

Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn
1570 1575 1580

Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn
1585 1590 1595 1600

Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp
1605 1610 1615

Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630



Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile
1635 1640 1645

His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys
1650 1655 1660

Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser
1665 1670 1675 1680

Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly
1685 1690 1695

Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr
1700 1705 1710

Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725

Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln
1730 1735 1740

Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760

Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His
1765 1770 1775

Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg
1780 1785 1790



Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile
1795 1800 1805

Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala
1810 1815 1820

Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro
1825 1830 1835 1840

Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855

Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala
1860 1865 1870

Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu
1875 1880 1885

Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn
1890 1895 1900

Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln
1905 1910 1915 1920

Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn
1925 1930 1935

Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly
1940 1945 1950

Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu
1955 1960 1965

Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala
1970 1975 1980

Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly
1985 1990 1995 2000

Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met
2005 2010 2015

Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser
2020 2025 2030

Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln
2035 2040 2045

Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met
2050 2055 2060

Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln
2065 2070 2075 2080

Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu
2085 2090 2095

Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg
2100 2105 2110

Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala
2115 2120 2125

Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly
2130 2135 2140

Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu
2145 2150 2155 2160

Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser
2165 2170 2175

Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn
2180 2185 2190

Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205

Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln
2210 2215 2220

Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser
2225 2230 2235 2240

Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr
2245 2250 2255

Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln
2260 2265 2270

Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro
2275 2280 2285

Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu
2290 2295 2300

Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser
2305 2310 2315 2320

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp
2325 2330 2335

Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala
2340 2345 2350

Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu
2355 2360 2365

Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
2370 2375 2380

Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385 2390 2395 2400

Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
2405 2410 2415

Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
2420 2425 2430

Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445

Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460

Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480

Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495

Leu His Ile Arg Tyr Thr Ile Arg *
2500 2505



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 (TcdAii N-terminus):

Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 (TcdB N-terminus):

Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu
1 5 10






(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 (TcaA^ N-terminus):

Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 (TcbA N-terminus):

Met Gln Asn Ser Leu
1 5



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 (TcdA^ - PT111 internal peptide):

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe
1 5 10

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 (TcdA^ - PT79 internal peptide):

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 (TcaBi - PT158 internal peptide):

Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser
1 5 10 15

Leu Gln Leu Phe Ile
20



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 (TcaBi - PT108 internal peptide):

Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 (TcbAii - PT103 internal peptide):

Gly Ile Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro
1 5 10 15

Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
20 25






(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 (TcbAii - PT56 internal peptide):

Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 (TcbA - PT81(a) internal peptide):

Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 (TcbA^ - PT81(b) internal peptide):

Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1 5 10 15

Val Gln Tyr Met Gln Ile
20

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6054 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..43
(D) OTHER INFORMATION: /product = "end of TcaA^"

(ix) FEATURE:
(A) NAME/KEY: RBS
(B) LOCATION: 51..58

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 65..3634
(D) OTHER INFORMATION: /product = "TcaBi"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

A GTA GCC CAA AAC TTA AGT GCC GCA ATC AGC AAT CGT CAG TAACCGGATA 50
Val Ala Gln Asn Leu Ser Ala Ala Ile Ser Asn Arg Gln

AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA 100
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
1 5 10

GCG CGC CGT GAT GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC 148
Ala Arg Arg Asp Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro
15 20 25

GCA GAT TTA AAA GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT 196
Ala Asp Leu Lys Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr
30 35 40

CTG TTG CTG GAT ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG 244
Leu Leu Leu Asp Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu
45 50 55 60

TCC GAA GCG ATT GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG 292
Ser Glu Ala Ile Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu
65 70 75

GGC TAT GAC GGC ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT 340
Gly Tyr Asp Gly Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp
80 85 90

GAA CAG TTT TTA TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT 388
Glu Gln Phe Leu Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr
95 100 105

TGG GCT GGC AAG GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT 436
Trp Ala Gly Lys Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp
110 115 120

CCA ACA TTG CGA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA 484
Pro Thr Leu Arg Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln
125 130 135 140

GGT ATT TCT CAA GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA 532
Gly Ile Ser Gln Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu
145 150 155

CGT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT 580
Arg Asp Tyr Leu Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile
160 165 170

ACT GCC TGC CAA GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 628
Thr Ala Cys Gln Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg
175 180 185

ACA CAG AAT GCA CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC 676
Thr Gln Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val
190 195 200

ACT GAT GGC GGT AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA 724
Thr Asp Gly Gly Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala
205 210 215 220

ATT AAT GCC GGG ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC 772
Ile Asn Ala Gly Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe
225 230 235

TGG GAA AAT AAC AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA 820
Trp Glu Asn Asn Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu
240 245 250

GAT AAA ATA GAT TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT 868
Asp Lys Ile Asp Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp
255 260 265

TAT AGC TGG GCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC 916
Tyr Ser Trp Ala Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp
270 275 280

TAC AAT AGA GTT GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT 964
Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala
285 290 295 300



TCA CAA TAT GGT TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT 1012 
Ser Gin Tyr Gly Ser Asp Ala Gin Met Asn He Ser Asp Asp Gly Thr 
305 310 315 



4 0 GTA CTT ATT TTT CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG 106 0 
Val Leu He Phe Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val 
320 325 330 

ACG TTA TGT TAT GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA 1108 
4 5 Thr Leu Cys Tyr Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr 
335 340 345 

GGA AGT GCA AAT TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC 1156
Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg
350 355 360

ATG TGT CAT GGA CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA 12 04 
Met Cys His Gly Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr 
365 370 375 380 



CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 12 52 
Leu Ser He Asn Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser 
385 390 395 



60 GAT GGA AAA CAA TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC 13 00 
Asp Gly Lys Gin Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His 
400 405 410 

CTC CCT AAT TAT GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT 134 8 
65 Leu Pro Asn Tyr Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp 
415 420 425 

TCA CTA CTT AAT TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG 13 96 
Ser Leu Leu Asn Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro 
70 430 * 435 440 



GTT GAT AAT TTC AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC 1444
Val Asp Asn Phe Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe
445 450 455 460 

5 TTC CAT ATT CCG TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT 14 92 

Phe His lie Pro Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg 
465 470 475 

TAC GAA GAC GCG GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT 154 0 

10 Tyr Glu Asp Ala Asp Thr Trp Tyr Lys Tyr lie Phe Arg Ser Ala Gly 
480 485 490 

TAT CGC GAT GCT AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT 15 88 

Tyr Arg Asp Ala Asn Gly Gin Leu lie Met Asp Gly Ser Lys Pro Arg 

15 495 500 505 

TAT TGG AAT GTG ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA 163 6 

Tyr Trp Asn Val Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr 
510 515 520 

20 

CAG CCC GCC ACC ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG 16 84 

Gin Pro Ala Thr Thr Asp Pro Asp Val lie Ala Met Ala Asp Pro Met 
525 530 535 540 

2 5 CAT TAC AAG CTG GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC 17 3 2 

His Tyr Lys Leu Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala 
545 550 555 

CGA GGC GAC AGC GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA 17 8 0 

30 Arg Gly Asp Ser Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu 
560 565 570 

GCC AAA ATG TAC TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT 182 8 

Ala Lys Met Tyr Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro 

25 ' 575 580 585 

GAT ATC CAT ACC ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA 187 6 

Asp He His Thr Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu 
590 595 600 



GCT GGC GCT ATT GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG 1924 
Ala Gly Ala He Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met 
605 610 615 620 



4 5 ACG TTC GCT GCC TGG CTA AGC GCA GGC GAT ACC GCA AAT ATT GGC GAC 1972 

Thr Phe Ala Ala Trp Leu Ser Ala Gly Asp Thr Ala Asn lie Gly Asp 
625 630 635 

GGT GAT TTC TTG CCA CCG TAC AAC GAT GTA CTA CTC GGT TAC TGG GAT 2 020 

50 Gly Asp Phe Leu Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp 
640 645 650 

AAA CTT GAG TTA CGC CTA TAC AAC CTG CGC CAC AAT CTG AGT CTG GAT 2 068 

Lys Leu Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp 

55 655 660 665 

GGT CAA CCG CTA AAT CTG CCA CTG TAT GCC ACG CCG GTA GAC CCG AAA 2116 

Gly Gin Pro Leu Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys 
670 675 ' 680 



ACC CTG CAA CGC CAG CAA GCC GGA GGG GAC GGT ACA GGC AGT AGT CCG 2164 
Thr Leu Gin Arg Gin Gin Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro 
685 690 695 700 



65 GCT GGT GGT CAA GGC AGT GTT CAG GGC TGG CGC TAT CCG TTA TTG GTA 2 212 
Ala Gly Gly Gin Gly Ser Val Gin Gly Trp Arg Tyr Pro Leu Leu Val 
705 710 715 

GAA CGC GCC CGC TCT GCC GTG AGT TTG TTG ACT CAG TTC GGC AAC AGC 2260
Glu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser
720 725 730

TTA CAA ACA ACG TTA GAA CAT CAG GAT AAT GAA AAA ATG ACG ATA CTG 2308
Leu Gln Thr Thr Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu
735 740 745

TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAG CAC GAT ATA CAA 23 56 

Leu Gin Thr Gin Gin Glu Ala lie Leu Lys His Gin His Asp lie Gin 

750 755 760 

10 CAA AAT AAT CTA AAA GGA TTA CAA CAC AGC CTG ACC GCA TTA CAG GCT 24 04 
Gin Asn Asn Leu Lys Gly Leu Gin His Ser Leu Thr Ala Leu Gin Ala 
765 770 775 780 

AGC CGT GAT GGC GAC ACA TTG CGG CAA AAA CAT TAC AGC GAC CTG ATT 2452 
15 Ser Arg Asp Gly Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu lie 

785 790 795 

AAC GGT GGT CTA TCT GCG GCA GAA ATC GCC GGT CTG ACA CTA CGC AGC 2500 
Asn Gly Gly Leu Ser Ala Ala Glu lie Ala Gly Leu Thr Leu Arg Ser 
20 800 805 810 

ACC GCC ATG ATT ACC AAT GGC GTT GCA ACG GGA TTG CTG ATT GCC GGC 254 8 
Thr Ala Met lie Thr Asn Gly Val Ala Thr Gly Leu Leu He Ala Gly 
815 820 825 






GGA ATC GCC AAC GCG GTA CCT AAC GTC TTC GGG CTG GCT AAC GGT GGA 25 96 
Gly He Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly 
830 835 840 



30 TCG GAA TGG GGA GCG CCA TTA ATT GGC TCC GGG CAA GCA ACC CAA GTT 2 644 
Ser Glu Trp Gly Ala Pro Leu He Gly Ser Gly Gin Ala Thr Gin Val 
845 850 855 860 

GGC GCC GGC ATC CAG GAT CAG AGC GCG GGC ATT TCA GAA GTG ACA GCA 2692 
35 Gly Ala Gly He Gin Asp Gin Ser Ala Gly He Ser Glu Val Thr Ala 

865 870 875 

GGC TAT CAG CGT CGT CAG GAA GAA TGG GCA TTG CAA CGG GAT ATT GCT 2 74 0 
Gly Tyr Gin Arg Arg Gin Glu Glu Trp Ala Leu Gin Arg Asp He Ala 
40 880 885 890 

GAT AAC GAA ATA ACC CAA CTG GAT GCC CAG ATA CAA AGC CTG CAA GAG 2788 
Asp Asn Glu He Thr Gin Leu Asp Ala Gin He Gin Ser Leu Gin Glu 
895 900 905

CAA ATC ACG ATG GCA CAA AAA CAG ATC ACG CTC TCT GAA ACC GAA CAA 2 836 
Gin He Thr Met Ala Gin Lys Gin He Thr Leu Ser Glu Thr Glu Gin 
910 915 920 

50 GCG AAT GCC CAA GCG ATT TAT GAC CTG CAA ACC ACT CGT TTT ACC GGG 2 8 84 
Ala Asn Ala Gin Ala He Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly 
925 930 935 940 

CAG GCA CTG TAT AAC TGG ATG GCC GGT CGT CTC TCC GCG CTC TAT TAC 2 932 
55 Gin Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr 

945 950 955 

CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2 980 
Gin Met Tyr Asp Ser Thr Leu Pro He Cys Leu Gin Pro Lys Ala Ala 
60 960 965 970 

TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 3 02 8 
Leu Val Gin Glu Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gin Val 
975 980 985



CCG GTG TGG AAT GAT CTG TGG CAA GGG CTG TTA GCA GGA GAA GGT TTA 3 076 
Pro Val Trp Asn Asp Leu Trp Gin Gly Leu Leu Ala Gly Glu Gly Leu 
990 995 1000 



AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3124
Ser Ser Glu Leu Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly
1005 1010 1015 1020

ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3172 

lie Gly Leu Glu Ala He Arg Thr Val Ser Leu Asp Thr Leu Phe Gly 
1025 1030 1035 

ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3220 

Thr Gly Thr Leu Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr

1040 1045 1050 

GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GGG GAT ATC TTC 3268 

Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp He Phe 
1055 1060 1065 



15 CAA GCA ACA CTG GAT TTG AGT CAG CTA GGT TTG GAT AAC TCT TAC AAC 3 316 

Gin Ala Thr Leu Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn 
1070 1075 1080 

TTG GGT AAC GAG AAG AAA CGT CGT ATT AAA CGT ATC GCC GTC ACC CTG 3364 

20 Leu Gly Asn Glu Lys Lys Arg Arg He Lys Arg He Ala Val Thr Leu 
1085 1090 1095 1100 

CCA ACA CTT CTG GGG CCA TAT CAA GAT CTT GAA GCC ACA CTG GTA ATG 3412 

Pro Thr Leu Leu Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met 
25 1105 1110 1115 

GGT GCG GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 34 6 0 

Gly Ala Glu He Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg 
1120 1125 1130 



TTT GTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GGT CGA 3 5 08 
Phe Val Thr Asp Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg 
1135 1140 1145 



3 5 GAT GCA ACA ACC GGC ACA CTG GAG CTC AAT ATT TTC CAT GCG GGT AAA 3 556 

Asp Ala Thr Thr Gly Thr Leu Glu Leu Asn He Phe His Ala Gly Lys 
1150 1155 1160 

GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3 6 04 

4 0 Glu Gly Thr Gin His Glu Leu Val Ala Asn Leu Ser Asp He He Val 

1165 1170 1175 1180 



CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTTTTC TTTGTCGATT 3654
His Leu Asn Tyr Ile Ile Arg Asp Ala *

1185 1190 


ACAGGTCCCT ATCAGGGGCC TGTTATTAAG GAGTACTTTA TGCAGGATTC ACCAGAAGTA 3714
TCGATTACAA CGCTGTCACT TCCCAAAGGT GGCGGTGCTA TCAATGGCAT GGGAGAAGCA 3774
CTGAATGCTG CCGGCCCTGA TGGAATGGCC TCCCTATCTC TGCCATTACC CCTTTCGACC 3834
GGCAGAGGGA CGGCTCCTGG ATTATCGCTG ATTTACAGCA ACAGTGCAGG TAATGGGCCT 3894
TTCGGCATCG GCTGGCAATG CGGTGTTATG TCCATTAGCC GACGCACCCA ACATGGCATT 3954
CCACAATACG GTAATGACGA CACGTTCCTA TCCCCACAAG GCGAGGTCAT GAATATCGCC 4014
CTGAATGACC AAGGGCAACC TGATATCCGT CAAGACGTTA AAACGCTGCA AGGCGTTACC 4074
TTGCCAATTT CCTATACCGT GACCCGCTAT CAAGCCCGCC AGATCCTGGA TTTCAGTAAA 4134
ATCGAATACT GGCAACCTGC CTCCGGTCAA GAAGGACGCG CTTTCTGGCT GATATCGACA 4194
CCGGACGGGC ATCTACACAT CTTAGGGAAA ACCGCGCAGG CTTGTCTGGC AAATCCGCAA 4254
AATGACCAAC AAATCGCCCA GTGGTTGCTG GAAGAAACTG TGACGCCAGC CGGTGAACAT 4314
GTCAGCTATC AATATCGAGC CGAAGATGAA GCCCATTGTG ACGACAATGA AAAAACCGCT 4374
CATCCCAATG TTACCGCACA GCGCTATCTG GTACAGGTGA ACTACAGGCA ACATCAAACC 4434
ACAAGCCAGC CTGTTCGTAC TGGATAACGC ACCTCCCGCA CCGGAAGAGT GGCTGTTTCA 4494
TCTGGTCTTT GACCACGGTG AGCGCGTACC TCACTTCATA CCGTGCCAAC ATGGGATGCA 4554
GGTACAGCGC AATGGTCTGT ACGCCCGGAT ATCTTCTCTC GCTATGAATA TGGTTTTGAA 4614
GTGCGTACTC GCCGCTTATG TCAACAAGTG CTGATGTTTC ACCGCACCGC GCTCATGGCC 4674
GGAGAAGCCA GTACCAATGA CGCCCCGGAA CTGGTTGGAC GCTTAATACT GGAATATGAC 4634
AAAAACGCCA GCGTCACCAC GTTGATTACC ATCCGTCAAT TAAGCCATGA ATCGGACGGG 4794
AGGCCAGTCA CCCAGCCACC ACTAGAACTA GCCTGGCAAC GGTTTGATCT GGAGAAAATC 4854

CCGACATGGC AACGCTTTGA CGCACTAGAT AATTTTAACT CGCAGCAACG TTATCAACTG 4865
GTTGATCTGC GGGGAGAAGG GTTGCCAGGT ATGCTGTATC AAGATCGAGG CGCTTGGTGG 4914
TATAAAGCTC CGCAACGTCA GGAAGACGGA GACAGCAATG CCGTCACTTA CGACAAAATC 4974
GCCCCACTGC CTACCCTACC CAATTTGCAG GATAATGCCT CATTGATGGA TATCAACGGA 5034
GACGGCCAAC TGGATTGGGT TGTTACCGCC TCCGGTATTC GCGGATACCA TAGTCAGCAA 5094

CCCGATGGAA AGTGGACGCA CTTTACGCCA ATCAATGCCT TGCCCGTGGA ATATTTTCAT 5214
CCAAGCATCC AGTTCGCTGA CCTTACCGGG GCAGGCTTAT CTGATTTAGT GTTGATCGGG 5274
CCGAAAAGCG TGCGTCTATA TGCCAACCAG CGAAACGGCT GGCGTAAAGG AGAAGATGTC 5334
CCCCAATCCA CAGGTATCAC CCTGCCTGTC ACAGGGACCG ATGCCCGCAA ACTGGTGGCT 5394
TTCAGTGATA TGCTCGGTTC CGGTCAACAA CATCTGGTGG AAATCAAGGG TAATCGCGTC 5454

ACCTGTTGGC CGAATCTAGG GCATGGCCGT TTCGGTCAAC CACTAACTCT GTCAGGATTT 5514
AGCCAGCCCG AAAATAGCTT CAATCCCGAA CGGCTGTTTC TGGCGGATAT CGACGGCTCC 5574
GGCACCACCG ACCTTATCTA TGCGCAATCC GGCTCTTTGC TCATTTATCT CAACCAAAGT 5634
GGTAATCAGT TTGATGCCCC GTTGACATTA GCGTTGCCAG AAGGCGTACA ATTTGACAAC 5694
ACTTGCCAAC TTCAAGTCGC CGATATTCAG GGATTAGGGA TAGCCAGCTT GATTCTGACT 5754

GTGCCACATA TCGCGCCACA TCACTGGCGT TGTGACCTGT CACTGACCAA ACCCTGGTTG 5814
TTGAATGTAA TGAACAATAA CCGGGGCGCA CATCACACGC TACATTATCG TAGTTCCGCG 5874
CAATTCTGGT TGGATGAAAA ATTACAGCTC ACCAAAGCAG GCAAATCTCC GGCTTGTTAT 5934
CTGCCGTTTC CAATGCATTT GCTATGGTAT ACCGAAATTC AGGATGAAAT CAGCGGCAAC 5994
CGGCTCACCA GTGAAGTCAA CTACAGCCAC GGCGTCTGGG ATGGTAAAGA GCGGGAATTC 6054

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1189 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26 (TcaB protein):

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 
35 40 45 

Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala He 
50 55 60 

Gly Ser Leu Gin Leu Phe He His Arg Ala He Glu Gly Tyr Asp Gly 
65 70 75 80 

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 
85 90 95 

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 
100 105 110 

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 
115 120 125 

Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly He Ser Gin 
130 135 140 

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 
145 150 155 160 

He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 
165 170 175 

Gly Lys Asp Asn Lys Thr lie Phe Phe He Gly Arg Thr Gin Asn Ala 
180 185 " 190 

Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 
195 200 205 

Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 
210 215 220 

He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 
225 230 235 240 

Lys Leu His lie Arg Trp Phe Thr He Ser Lys Glu Asp Lys lie Asp 
245 250 255 

Phe Val Tyr Lys Asn lie Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 
260 265 270 

Ser Lys Lys Lys He Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 
275 280 285 

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly 
290 295 300 

Ser Asp Ala Gin Met Asn lie Ser Asp Asp Gly Thr Val Leu lie Phe 
305 310 315 320 

Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 
325 330 335 

Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 
340 345 350 

Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 
355 360 365 

Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser lie Asn 
370 375 380 

Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

Phe Thr Pro Pro Ser Gly Ser Ala lie Asp Leu His Leu Pro Asn Tyr 
405 410 415 

Val Asp Leu Asn Ala Leu Leu Asp lie Ser Leu Asp Ser Leu Leu Asn 
420 425 430 

Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 
435 440 445 

Ser Gly Pro Tyr Gly lie Tyr Leu Trp Glu lie Phe Phe His lie Pro 
450 455 460 



Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 
15 465 470 475 480 

Asp Thr Trp Tyr Lys Tyr lie Phe Arg Ser Ala Gly Tyr Arg Asp Ala 
485 490 495 

2 0 Asn Gly Gin Leu lie Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 
500 505 510 

Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 
515 520 525 

25 

Thr Asp Pro Asp Val lie Ala Met Ala Asp Pro Met His Tyr Lys Leu 
530 535 540 

Ala lie Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 
30 545 550 555 560 

Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 
565 570 575 

35 Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr 
580 585 590 



Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 
595 600 605 

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 

610 615 620 



Trp Leu Ser Ala Gly Asp Thr Ala Asn lie Gly Asp Gly Asp Phe Leu 

45 625 630 635 640 

Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu 

645 650 655 

50 Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gin Pro Leu 

660 665 670 



Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gin Arg 
675 680 685 

Gin Gin Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gin 
690 695 700 



Gly Ser Val Gin Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg 

60 705 710 715 720 

Ser Ala Val Ser Leu Leu Thr Gin Phe Gly Asn Ser Leu Gin Thr Thr 

725 730 735 

65 Leu Glu His Gin Asp Asn Glu Lys Met Thr He Leu Leu Gin Thr Gin 

740 745 750 



Gin Glu Ala He Leu Lys His Gin His Asp lie Gin Gin Asn Asn Leu 
755 760 765 

Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly
770 775 780

Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu lie Asn Gly Gly Leu 
785 790 795 800 

5 

Ser Ala Ala Glu lie Ala Gly Leu Thr Leu Arg Ser Thr Ala Met lie 
805 810 815 

Thr Asn Gly Val Ala Thr Gly Leu Leu lie Ala Gly Gly lie Ala Asn 
820 825 830

Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly 
835 840 845 

15 Ala Pro Leu lie Gly Ser Gly Gin Ala Thr Gin Val Gly Ala Gly lie 
850 855 860 






Gin Asp Gin Ser Ala Gly lie Ser Glu Val Thr Ala Gly Tyr Gin Arg 

865 870 875 880 

Arg Gin Glu Glu Trp Ala Leu Gin Arg Asp He Ala Asp Asn Glu He 

885 890 895 



Thr Gin Leu Asp Ala Gin He Gin Ser Leu Gin Glu Gin He Thr Met 

25 900 905 910 

Ala Gin Lys Gin He Thr Leu Ser Glu Thr Glu Gin Ala Asn Ala Gin 

915 920 925 

30 Ala lie Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly Gin Ala Leu Tyr 

930 935 940 



Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gin Met Tyr Asp 

945 ' 950 955 960 

Ser Thr Leu Pro lie Cys Leu Gin Pro Lys Ala Ala Leu Val Gin Glu 

965 970 975 



Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gin Val Pro Val Trp Asn 
40 980 985 990 

Asp Leu Trp Gin Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu 
995 1000 1005 

4 5 Gin Lys Leu Asp Ala He Trp Leu Ala Arg Gly Gly lie Gly Leu Glu 
1010 1015 1020 



Ala He Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu 
1025 " 1030 1035 1040 

Ser Glu Asn He Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser 
1045 1050 1055 



Gly Gly Val Thr Leu Ala Leu Thr Gly Asp He Phe Gin Ala Thr Leu 
55 1060 1065 1070 

Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu 
1075 1080 1085 

60 Lys Lys Arg Arg lie Lys Arg lie Ala Val Thr Leu Pro Thr Leu Leu 
1090 1095 1100 



Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu He 
1105 1110 1115 1120 

Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp 
1125 1130 1135 



Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr
1140 1145 1150

Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln
1155 1160 1165

His Glu Leu Val Ala Asn Leu Ser Asp lie lie Val His Leu Asn Tyr 
1170 1175 1180 

He He Arg Asp Ala * 
1185 1190 



(2) INFORMATION FOR SEQ ID NO: 27: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1881 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1881
(D) OTHER INFORMATION: tcaBi

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27 (tcaBi coding region)

ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA GCG CGC CGT GAT 48
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC GCA GAT TTA AAA 96
Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT CTG TTG CTG GAT 144 
35 Glu Ser lie Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 
35 40 45 

ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG TCC GAA GCG ATT 192 
Thr Lys lie Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala lie 
40 50 55 60 

GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG GGC TAT GAC GGC 24 0 

Gly Ser Leu Gin Leu Phe lie His Arg Ala lie Glu Gly Tyr Asp Gly 
65 70 75 80 

45 

ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT GAA CAG TTT TTA 2 88 

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 
85 90 95 

50 TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT TGG GCT GGC AAG 3 36 
Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 
100 105 110 

GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT CCA ACA TTG CGA 3 84 
5 5 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr lie Asp Pro Thr Leu Arg 
115 120 125 

TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA GGT ATT TCT CAA 4 32 
Leu Asn Lys Thr Glu lie Phe Thr Ala Phe Glu Gin Gly lie Ser Gin 
60 130 135 140 

GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA CGT GAT TAT CTA 4 80 

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 

145 150 155 160 

ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT ACT GCC TGC CAA 528
Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT ACA CAG AAT GCA 576
Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190

5 

CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC ACT GAT GGC GGT 624 

Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 

195 200 205 

10 AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA ATT AAT GCC GGG 672 
Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala lie Asn Ala Gly 
210 * 215 220 

ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC TGG GAA AAT AAC 72 0 
15 lie Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 
225 230 235 240 

AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA GAT AAA ATA GAT 768 
Lys Leu His lie Arg Trp Phe Thr lie Ser Lys Glu Asp Lys lie Asp 
20 245 250 255 

TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT TAT AGC TGG GCA 816 
Phe Val Tyr Lys Asn lie Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 
260 265 270 






TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC TAC AAT AGA GTT 8 64 
Ser Lys Lys Lys lie Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 
275 280 285 



30 GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT TCA CAA TAT GGT 912 

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly 

290 295 300 

TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT GTA CTT ATT TTT 96 0 

35 Ser Asp Ala Gin Met Asn lie Ser Asp Asp Gly Thr Val Leu He Phe 
305 310 315 320 

CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG ACG TTA TGT TAT 100 8 

Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 
40 325 330 335 

GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA GGA AGT GCA AAT 1056 

Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 
340 345 350 



TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC ATG TGT CAT GGA 1104 
Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 
355 360 365 



CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA CTC TCT ATT AAT 1152
Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA GAT GGA AAA CAA 12 00 
5 5 Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 
385 390 395 400 

TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC CTC CCT AAT TAT 12 4 8 
Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His Leu Pro Asn Tyr 
60 405 410 415 

GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT TCA CTA CTT AAT 12 96 
Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp Ser Leu Leu Asn 
420 ~ 425 430 



TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG GTT GAT AAT TTC 134 4 
Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 
435 440 445 



AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC TTC CAT ATT CCG 1392
Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT TAC GAA GAC GCG 1440 

Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 

465 470 475 480 

GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT TAT CGC GAT GCT 1488

Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly Tyr Arg Asp Ala 
485 490 495 

AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT TAT TGG AAT GTG 15 36 

Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 
500 505 510 

15 ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA CAG CCC GCC ACC 1584 

Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 
515 520 525 

ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG CAT TAC AAG CTG 163 2 

20 Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 
530 535 540 

GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC CGA GGC GAC AGC 1680 

Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 

25 545 550 555 560 

GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA GCC AAA ATG TAC 172 8 

Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 
565 570 575 



TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT GAT ATC CAT ACC 1776 
Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr 
580 585 590 



3 5 ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA GCT GGC GCT ATT 1824 

Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 
595 600 * 605 

GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG ACG TTC GCT GCC 18 72 

4 0 Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 

610 615 620 

TGG CTA AGC 1881 
Trp Leu Ser 
45 625 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 627 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28 (TcaBi protein):

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr He Ala Thr Gin Val Pro Ala Asp Leu Lys 
20 25 30 



Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 
65 35 40 45 

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 
5 85 90 95 

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 
100 105 110 

10 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 

115 120 125 



Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly lie Ser Gin 

130 135 140 

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 

145 150 155 160 



He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 
20 165 170 175 

Gly Lys Asp Asn Lys Thr He Phe Phe lie Gly Arg Thr Gin Asn Ala 
180 185 190 

2 5 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 

195 200 205 



Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 

210 215 220 

He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 

225 230 235 240 



Lys Leu His He Arg Trp Phe Thr lie Ser Lys Glu Asp Lys He Asp 

35 245 250 255 

Phe Val Tyr Lys Asn lie Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 

260 265 270 

4 0 Ser Lys Lys Lys lie Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 
275 280 285 



Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly 

290 295 300 

Ser Asp Ala Gin Met Asn lie Ser Asp Asp Gly Thr Val Leu lie Phe 

305 310 315 320 



Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 
50 325 330 335 

Asp Ser Gly Asn Val lie Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 
340 345 350 

55 Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 
355 360 365 



Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser lie Asn 
370 375 380 

Thr lie Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 
385 390 395 400 

Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His Leu Pro Asn Tyr 
405 410 415 

Val Asp Leu Asn Ala Leu Leu Asp lie Ser Leu Asp Ser Leu Leu Asn 
420 425 ' 430 

Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 
465 470 475 480 

Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly Tyr Arg Asp Ala 
485 490 495 

Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 
500 505 510 



Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 
15 515 520 525 

Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 
530 535 540 

20 Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 
545 550 555 560 

Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 
565 570 575

Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr 
580 585 590 

Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 
30 595 600 605 

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 
610 615 620 

35 Trp Leu Ser 
625 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1689 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1689
(D) OTHER INFORMATION: tcaBii

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29 (tcaBii coding region)

GCA GGC GAT ACC GCA AAT ATT GGC GAC GGT GAT TTC TTG CCA CCG TAC 48
Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

AAC GAT GTA CTA CTC GGT TAC TGG GAT AAA CTT GAG TTA CGC CTA TAC 96 
Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr 
20 25 30



AAC CTG CGC CAC AAT CTG AGT CTG GAT GGT CAA CCG CTA AAT CTG CCA 144
Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

CTG TAT GCC ACG CCG GTA GAC CCG AAA ACC CTG CAA CGC CAG CAA GCC 192 
Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gin Arg Gin Gin Ala 
50 55 60 




GGA GGG GAC GGT ACA GGC AGT AGT CCG GCT GGT GGT CAA GGC AGT GTT 240
Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

CAG GGC TGG CGC TAT CCG TTA TTG GTA GAA CGC GCC CGC TCT GCC GTG 288 
Gin Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val 
85 90 95 

AGT TTG TTG ACT CAG TTC GGC AAC AGC TTA CAA ACA ACG TTA GAA CAT 336
Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

CAG GAT AAT GAA AAA ATG ACG ATA CTG TTG CAG ACT CAA CAG GAA GCC 3 84 
15 Gin Asp Asn Glu Lys Met Thr lie Leu Leu Gin Thr Gin Gin Glu Ala 
115 * 120 125 

ATC CTG AAA CAT CAG CAC GAT ATA CAA CAA AAT AAT CTA AAA GGA TTA 4 32 
lie Leu Lys His Gin His Asp He Gin Gin Asn Asn Leu Lys Gly Leu 
20 130 135 140 

CAA CAC AGC CTG ACC GCA TTA CAG GCT AGC CGT GAT GGC GAC ACA TTG 4 80 

Gin His Ser Leu Thr Ala Leu Gin Ala Ser Arg Asp Gly Asp Thr Leu 
145 150 155 160 

25 

CGG CAA AAA CAT TAC AGC GAC CTG ATT AAC GGT GGT CTA TCT GCG GCA 52 8 

Arg Gin Lys His Tyr Ser Asp Leu He Asn Gly Gly Leu Ser Ala Ala 

165 170 175

30 GAA ATC GCC GGT CTG ACA CTA CGC AGC ACC GCC ATG ATT ACC AAT GGC 576 
Glu He Ala Gly Leu Thr Leu Arg Ser Thr Ala Met He Thr Asn Gly 
180 185 190 

GTT GCA ACG GGA TTG CTG ATT GCC GGC GGA ATC GCC AAC GCG GTA CCT 624
Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

AAC GTC TTC GGG CTG GCT AAC GGT GGA TCG GAA TGG GGA GCG CCA TTA 6 72 
Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu 
40 210 215 220 

ATT GGC TCC GGG CAA GCA ACC CAA GTT GGC GCC GGC ATC CAG GAT CAG 720 
He Gly Ser Gly Gin Ala Thr Gin Val Gly Ala Gly He Gin Asp Gin 
225 ' * 230 235 240 






AGC GCG GGC ATT TCA GAA GTG ACA GCA GGC TAT CAG CGT CGT CAG GAA 768
Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

GAA TGG GCA TTG CAA CGG GAT ATT GCT GAT AAC GAA ATA ACC CAA CTG 816
Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

GAT GCC CAG ATA CAA AGC CTG CAA GAG CAA ATC ACG ATG GCA CAA AAA 864
Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

CAG ATC ACG CTC TCT GAA ACC GAA CAA GCG AAT GCC CAA GCG ATT TAT 912
Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr
290 295 300

GAC CTG CAA ACC ACT CGT TTT ACC GGG CAG GCA CTG TAT AAC TGG ATG 960
Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met
305 310 315 320



GCC GGT CGT CTC TCC GCG CTC TAT TAC CAA ATG TAT GAT TCC ACT CTG 1008
Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu
325 330 335

CCA ATC TGT CTC CAG CCA AAA GCC GCA TTA GTA CAG GAA TTA GGC GAG 1056
Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu
340 345 350

AAA GAG AGC GAC AGT CTT TTC CAG GTT CCG GTG TGG AAT GAT CTG TGG 1104
Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp
355 360 365

CAA GGG CTG TTA GCA GGA GAA GGT TTA AGT TCA GAG CTA CAG AAA CTG 1152
Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu
370 375 380

GAT GCC ATC TGG CTT GCA CGT GGT GGT ATT GGG CTA GAA GCC ATC CGC 1200
Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg
385 390 395 400



ACC GTG TCG CTG GAT ACC CTG TTT GGC ACA GGG ACG TTA AGT GAA AAT 1248
Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn
405 410 415

ATC AAT AAA GTG CTT AAC GGG GAA ACG GTA TCT CCA TCC GGT GGC GTC 1296
Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val
420 425 430

ACT CTG GCG CTG ACA GGG GAT ATC TTC CAA GCA ACA CTG GAT TTG AGT 1344
Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser
435 440 445

CAG CTA GGT TTG GAT AAC TCT TAC AAC TTG GGT AAC GAG AAG AAA CGT 1392
Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460

CGT ATT AAA CGT ATC GCC GTC ACC CTG CCA ACA CTT CTG GGG CCA TAT 1440
Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr
465 470 475 480



CAA GAT CTT GAA GCC ACA CTG GTA ATG GGT GCG GAA ATC GCC GCC TTA 1488
Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu
485 490 495

TCA CAC GGT GTG AAT GAC GGA GGC CGG TTT GTT ACC GAC TTT AAC GAC 1536
Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp
500 505 510

AGC CGT TTT CTG CCT TTT GAA GGT CGA GAT GCA ACA ACC GGC ACA CTG 1584
Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu
515 520 525

GAG CTC AAT ATT TTC CAT GCG GGT AAA GAG GGA ACG CAA CAC GAG TTG 1632
Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu
530 535 540

GTC GCG AAT CTG AGT GAC ATC ATT GTG CAT CTG AAT TAC ATC ATT CGA 1680
Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg
545 550 555 560

GAC GCG TAA 1689
Asp Ala *



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 562 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30 (Tcalff protein) : 

Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
20 25 30

Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala
50 55 60

Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95

Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala
115 120 125

Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu
130 135 140

Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160

Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala
165 170 175

Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly
180 185 190

Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu
210 215 220

Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln
225 230 235 240

Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr
290 295 300

Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met
305 310 315 320

Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu
325 330 335

Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu
340 345 350

Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp
355 360 365

Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu
370 375 380

Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg
385 390 395 400

Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn
405 410 415

Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val
420 425 430

Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser
435 440 445

Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460

Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr
465 470 475 480



Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu
485 490 495

Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp
500 505 510

Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu
515 520 525

Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu
530 535 540

Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg
545 550 555 560



Asp Ala 



(2) INFORMATION FOR SEQ ID NO: 31: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4458 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..4458

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31 (tcaC gene):

ATG CAG GAT TCA CCA GAA GTA TCG ATT ACA ACG CTG TCA CTT CCC AAA 48
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15

GGT GGC GGT GCT ATC AAT GGC ATG GGA GAA GCA CTG AAT GCT GCC GGC 96
Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30

CCT GAT GGA ATG GCC TCC CTA TCT CTG CCA TTA CCC CTT TCG ACC GGC 144
Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45

AGA GGG ACG GCT CCT GGA TTA TCG CTG ATT TAC AGC AAC AGT GCA GGT 192
Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60

AAT GGG CCT TTC GGC ATC GGC TGG CAA TGC GGT GTT ATG TCC ATT AGC 240
Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80

CGA CGC ACC CAA CAT GGC ATT CCA CAA TAC GGT AAT GAC GAC ACG TTC 288
Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95

CTA TCC CCA CAA GGC GAG GTC ATG AAT ATC GCC CTG AAT GAC CAA GGG 336
Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110

CAA CCT GAT ATC CGT CAA GAC GTT AAA ACG CTG CAA GGC GTT ACC TTG 384
Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125

CCA ATT TCC TAT ACC GTG ACC CGC TAT CAA GCC CGC CAG ATC CTG GAT 432
Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140

TTC AGT AAA ATC GAA TAC TGG CAA CCT GCC TCC GGT CAA GAA GGA CGC 480
Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160

GCT TTC TGG CTG ATA TCG ACA CCG GAC GGG CAT CTA CAC ATC TTA GGG 528
Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly
165 170 175

AAA ACC GCG CAG GCT TGT CTG GCA AAT CCG CAA AAT GAC CAA CAA ATC 576
Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190

GCC CAG TGG TTG CTG GAA GAA ACT GTG ACG CCA GCC GGT GAA CAT GTC 624
Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205

AGC TAT CAA TAT CGA GCC GAA GAT GAA GCC CAT TGT GAC GAC AAT GAA 672
Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220

AAA ACC GCT CAT CCC AAT GTT ACC GCA CAG CGC TAT CTG GTA CAG GTG 720
Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240

AAC TAC GGC AAC ATC AAA CCA CAA GCC AGC CTG TTC GTA CTG GAT AAC 768
Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255



GCA CCT CCC GCA CCG GAA GAG TGG CTG TTT CAT CTG GTC TTT GAC CAC 816 
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His 
260 265 270 



GGT GAG CGC GAT ACC TCA CTT CAT ACC GTG CCA ACA TGG GAT GCA GGT 864
Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285

ACA GCG CAA TGG TCT GTA CGC CCG GAT ATC TTC TCT CGC TAT GAA TAT 912
Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300

GGT TTT GAA GTG CGT ACT CGC CGC TTA TGT CAA CAA GTG CTG ATG TTT 960
Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320

CAC CGC ACC GCG CTC ATG GCC GGA GAA GCC AGT ACC AAT GAC GCC CCG 1008
His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335



GAA CTG GTT GGA CGC TTA ATA CTG GAA TAT GAC AAA AAC GCC AGC GTC 1056
Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350

ACC ACG TTG ATT ACC ATC CGT CAA TTA AGC CAT GAA TCG GAC GGG AGG 1104
Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg
355 360 365

CCA GTC ACC CAG CCA CCA CTA GAA CTA GCC TGG CAA CGG TTT GAT CTG 1152
Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380

GAG AAA ATC CCG ACA TGG CAA CGC TTT GAC GCA CTA GAT AAT TTT AAC 1200
Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400

TCG CAG CAA CGT TAT CAA CTG GTT GAT CTG CGG GGA GAA GGG TTG CCA 1248
Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415

GGT ATG CTG TAT CAA GAT CGA GGC GCT TGG TGG TAT AAA GCT CCG CAA 1296
Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430

CGT CAG GAA GAC GGA GAC AGC AAT GCC GTC ACT TAC GAC AAA ATC GCC 1344
Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445

CCA CTG CCT ACC CTA CCC AAT TTG CAG GAT AAT GCC TCA TTG ATG GAT 1392
Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460

ATC AAC GGA GAC GGC CAA CTG GAT TGG GTT GTT ACC GCC TCC GGT ATT 1440
Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

CGC GGA TAC CAT AGT CAG CAA CCC GAT GGA AAG TGG ACG CAC TTT ACG 1488
Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495



CCA ATC AAT GCC TTG CCC GTG GAA TAT TTT CAT CCA AGC ATC CAG TTC 1536
Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

GCT GAC CTT ACC GGG GCA GGC TTA TCT GAT TTA GTG TTG ATC GGG CCG 1584
Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

AAA AGC GTG CGT CTA TAT GCC AAC CAG CGA AAC GGC TGG CGT AAA GGA 1632
Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

GAA GAT GTC CCC CAA TCC ACA GGT ATC ACC CTG CCT GTC ACA GGG ACC 1680
Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560



GAT GCC CGC AAA CTG GTG GCT TTC AGT GAT ATG CTC GGT TCC GGT CAA 1728
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575

CAA CAT CTG GTG GAA ATC AAG GGT AAT CGC GTC ACC TGT TGG CCG AAT 1776
Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

CTA GGG CAT GGC CGT TTC GGT CAA CCA CTA ACT CTG TCA GGA TTT AGC 1824
Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

CAG CCC GAA AAT AGC TTC AAT CCC GAA CGG CTG TTT CTG GCG GAT ATC 1872
Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

GAC GGC TCC GGC ACC^BC GAC CTT ATC TAT GCG CAA TCC S^^TCT TTG 1920
Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

CTC ATT TAT CTC AAC CAA AGT GGT AAT CAG TTT GAT GCC CCG TTG ACA 1968
Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655

TTA GCG TTG CCA GAA GGC GTA CAA TTT GAC AAC ACT TGC CAA CTT CAA 2016
Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

GTC GCC GAT ATT CAG GGA TTA GGG ATA GCC AGC TTG ATT CTG ACT GTG 2064
Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

CCA CAT ATC GCG CCA CAT CAC TGG CGT TGT GAC CTG TCA CTG ACC AAA 2112
Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

CCC TGG TTG TTG AAT GTA ATG AAC AAT AAC CGG GGC GCA CAT CAC ACG 2160
Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

CTA CAT TAT CGT AGT TCC GCG CAA TTC TGG TTG GAT GAA AAA TTA CAG 2208
Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

CTC ACC AAA GCA GGC AAA TCT CCG GCT TGT TAT CTG CCG TTT CCA ATG 2256
Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

CAT TTG CTA TGG TAT ACC GAA ATT CAG GAT GAA ATC AGC GGC AAC CGG 2304
His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

CTC ACC AGT GAA GTC AAC TAC AGC CAC GGC GTC TGG GAT GGT AAA GAG 2352
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

CGG GAA TTC AGA GGA TTT GGC TGC ATC AAA CAG ACA GAT ACC ACA ACG 2400
Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

TTT TCT CAC GGC ACC GCC CCC GAA CAG GCG GCA CCG TCG CTG AGT ATT 2448
Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815

AGC TGG TTT GCC ACC GGC ATG GAT GAA GTA GAC AGC CAA TTA GCT ACG 2496
Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

GAA TAT TGG CAG GCA GAC ACG CAA GCT TAT AGC GGA TTT GAA ACC CGT 2544
Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

TAT ACC GTC TGG GAT CAC ACC AAC CAG ACA GAC CAA GCA TTT ACC CCC 2592
Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860



AAT GAG ACA CAA CGT AAC TGG CTG ACG CGA GCG CTT AAA GGC CAA CTG 2640
Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

CTA CGC ACT GAG CTC TAC GGT CTG GAC GGA ACA GAT AAG CAA ACA GTG 2688
Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

CCT TAT ACC GTC AGT GAA TCG CGC TAT CAG GTA CGC TCT ATT CCC GTA 2736
Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

AAT AAA GAA ACT GAA TTA TCT GCC TGG GTG ACT GCT ATT GAA AAT CGC 2784
Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

AGC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAG AGT 2832
Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2880
Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

ATT GCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2928
Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975

ACC CTG CCG GAA ACG CTA TTT GAC AGC AGC TAT GAT GAT CAA CAA CAA 2976
Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990

CTA TTA CGT CTG GTG AGA CAA AAA AAT AGC TGG CAT CAC CTG ACT GAT 3024
Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005






GGG GAA AAC TGG CGA TTA GGT TTA CCG AAT GCA CAA CGC CGT GAT GTT 3072
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val
1010 1015 1020

TAT ACT TAT GAC CGG AGC AAA ATT CCA ACC GAA GGG ATT TCC CTT GAA 3120
Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu
1025 1030 1035 1040

ATC TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA GCG GCC GTT 3168
Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055

TAT CTG GGA CAA CAA CAG ACG TTT TAC ACC GCC GGT CAA GCG GAA GTC 3216
Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val
1060 1065 1070

ACT CTA GAA AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3264
Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr
1075 1080 1085



GCC ATG ATG GAC GAT ACC TCA TTA CAG GCG TAT GAA GGC GTG ATT GAA 3312
Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu
1090 1095 1100

GAG CAA GAG TTG AAT ACC GCG CTG ACA CAG GCC GGT TAT CAG CAA GTC 3360
Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val
1105 1110 1115 1120

GCG CGG TTG TTT AAT ACC AGA TCA GAA AGC CCG GTA TGG GCG GCA CGG 3408
Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135

CAA GGT TAT ACC GAT TAC GGT GAC GCC GCA CAG TTC TGG CGG CCT CAG 3456
Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln
1140 1145 1150

GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3504
Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp
1155 1160 1165

ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3552
Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr
1170 1175 1180



ACG CAA GCC CAT TAC GAT TAT CGT TTC CTT ACA CCG GTA CAA CTG ACA 3600
Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr
1185 1190 1195 1200

GAT ATT AAT GAT AAT CAA CAT ATT GTG ACT CTG GAC GCG CTA GGT CGC 3648
Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg
1205 1210 1215

GTA ACC ACC AGC CGG TTC TGG GGC ACA GAG GCA GGA CAA GCC GCA GGC 3696
Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly
1220 1225 1230

TAT TCC AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG CTG 3744
Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245



GCA TTA ACC GGC GCA CTC CCT GTT GCC CAA TGT TTA GTC TAT GCC GTT 3792
Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Val
1250 1255 1260

GAT AGC TGG ATG CCG TCG TTA TCT TTG TCT CAG CTT TCT CAG TCA CAA 3840
Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln Ser Gln
1265 1270 1275 1280

GAA GAG GCA GAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATG ATT 3888
Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala His Met Ile
1285 1290 1295

ACC GAA GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3936
Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser
1300 1305 1310

CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA AGT ATT CCC 3984
His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu Leu Ala Ser Ile Pro
1315 1320 1325



CGT TTA CCG CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032
Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser
1330 1335 1340

GAT CCG CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT AGT GAC GGT TTT 4080
Asp Pro Gln Gln Gln His Gln Gln Thr Val Ser Phe Ser Asp Gly Phe
1345 1350 1355 1360

GGC CGG TTA CTC CAG AGT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4128
Gly Arg Leu Leu Gln Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp
1365 1370 1375

CAA CGT AAA GAG GAT GGC GGG CTG GTC GTG GAT GCA AAT GGC GTT CTG 4176
Gln Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu
1380 1385 1390



GTC AGT GCC CCT ACA GAC ACC CGA TGG GCC GTT TCC GGT CGC ACA GAA 4224 
Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 
1395 1400 1405 



TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272
Tyr Asp Asp Lys Gly Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu
1410 1415 1420

AAT GAC TGG CGT TAC GTT AGT GAT GAC AGC GCA CGA GAT GAC CTG TTT 4320
Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440

GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CGG GAA TAC AAA GTC ATC 4368
Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile
1445 1450 1455

ACT GCT AAG AAA TAT TTG CGA GAA AAG CTG TAC ACC CCG TGG TTT ATT 4416
Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile
1460 1465 1470

GTC AGT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4458
Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro *
1475 1480 1485



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1485 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32 (TcaC protein): 

Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15

Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30

Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45

Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60

Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80

Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95

Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110

Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125

Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140

Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160

Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly
165 170 175

Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190

Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205

Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220

Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240

Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255

Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270

Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285

Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300

Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320

His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335

Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350

Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg
355 360 365

Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380

Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400

Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415



Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430

Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445

Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460

Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495

Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560

Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575



Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655

Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815



Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860

Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975



Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990

Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005

Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val
1010 1015 1020

Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu
1025 1030 1035 1040

Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055

Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val
1060 1065 1070

Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr
1075 1080 1085

Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu
1090 1095 1100

Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val
1105 1110 1115 1120

Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135

Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln
1140 1145 1150

Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp
1155 1160 1165

Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr
1170 1175 1180

Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr
1185 1190 1195 1200

Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg
1205 1210 1215

Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly
1220 1225 1230

Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245

Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Val
1250 1255 1260

Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln Ser Gln
1265 1270 1275 1280

Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala His Met Ile
1285 1290 1295

Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser
1300 1305 1310

His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu Leu Ala Ser Ile Pro
1315 1320 1325

Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser
1330 1335 1340

Asp Pro Gln Gln Gln His Gln Gln Thr Val Ser Phe Ser Asp Gly Phe
1345 1350 1355 1360

Gly Arg Leu Leu Gln Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp
1365 1370 1375

Gln Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu
1380 1385 1390

Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 
1395 1400 1405 

Tyr Asp Asp Lys Gly Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu
1410 1415 1420

Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440

Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile
1445 1450 1455

Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile
1460 1465 1470

Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro *
1475 1480 1485

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3288 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33 (tcaA gene):

ATG GTG ACT GTT ATG CAA AAT AAA ATA TCA TTT TTA TCA GGT ACA TCC 48
Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser
1 5 10 15

GAA CAG CCC CTG CTT GAC GCC GGT TAT CAA AAC GTA TTT GAT ATC GCA 96
Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile Ala
20 25 30

TCA ATC AGC CGG GCT ACT TTC GTT CAA TCC GTT CCC ACC CTG CCC GTT 144
Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val
35 40 45

AAA GAG GCT CAT ACC GTC TAT CGT CAG GCG CGG CAA CGT GCG GAA AAT 192
Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn
50 55 60

CTG AAA TCC CTC TAC CGA GCC TGG CAA TTG CGT CAG GAG CCG GTT ATT 240
Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile
65 70 75 80

AAA CGG CTG GCT AAA CTT AAC CTA CAA TCC AAC GTT TCT GTG CTT CAA 288
Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln
85 90 95

GAT GCT TTG GTA GAG AAT ATT GGC GGT GAT GGG GAT TTC AGC GAT TTA 336
Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu
100 105 110

ATG AAC CGT GCC AGT CAA TAT GCT GAC GCT GCC TCT ATT CAA TCC CTA 384
Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu
115 120 125

TTT TCA CCG GGC CGT TAT GCT TCC GCA CTC TAC AGA GTT GCT AAA GAT 432
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp
130 135 140

CTG CAT AAA TCA GAT TCC AGT TTG CAT ATT GAT AAT CGC CGC GCT GAT 480
Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp
145 150 155 160

CTG AAG GAT CTG ATA TTA AGC GAA ACG ACG ATG AAT AAA GAG GTC ACT 528
Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
165 170 175

TCC CTT GAT ATC TTG TTG GAT GTG CTA CAA AAA GGC GGT AAA GAT ATT 576
Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile
180 185 190

ACT GAG CTG TCC GGC GCA TTC TTC CCA ATG ACG TTA CCT TAT GAC GAT 624
Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
195 200 205

CAT CTG TCG CAA ATC GAT TCC GCT TTA TCG GCA CAA GCC AGA ACG CTG 672
His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu
210 215 220

AAC GGT GTG TGG AAT ACT TTG ACA GAT ACC ACG GCA CAA GCG GTT TCA 720
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser
225 230 235 240

GAA CAA ACC AGT AAT ACG AAT ACA CGC AAA CTG TTC GCT GCC CAA GAT 768
Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp
245 250 255

GGT AAT CAA GAT ACA TTT TTT TCC GGA AAC ACT TTT TAT TTC AAA GCG 816
Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala
260 265 270

GTG GGA TTC AGC GGG CAA CCT ATG GTT TAC CTG TCA CAG TAC ACC AGC 864
Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser
275 280 285

GGG AAC GGC ATT GTC GGC GCA CAA TTG ATT GCA GGT AAT CCA GAC CAA 912
Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln
290 295 300

GCC GCC GCC GCA ATA GTC GCA CCG TTG AAA CTC ACT TGG TCA ATG GCA 960
Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala
305 310 315 320

AAA CAG TGT TAC TAC CTC GTC GCT CCC GAT GGT ACA ACG ATG GGA GAC 1008
Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp
325 330 335

GGT AAT GTT CTG ACC GGC TGT TTC TTA AGA GGC AAC AGC CCA ACT AAC 1056
Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn
340 345 350

CCG GAT AAA GAC GGT ATT TTT GCT CAG GTA GCC AAC AAA TCA GGC AGT 1104
Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser
355 360 365

ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC ACA CTG GAA CAC AGC 1152
Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser
370 375 380

GAG AAT AAA GAT CAG TAC TAT CTG AAA ACA GAG CAG GGT TAT ATC ACG 1200
Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr
385 390 395 400

GTA GAT AGT TCC GGA CAG TCA AAT TGG AAA AAC GCG CTG GTT ATC AAT 1248
Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn
405 410 415

GGG ACA AAA GAC AAG GGG CTG TTA TTA ACC TTT TGC AGC GAT AGC TCA 1296
Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser
420 425 430

GGC ACT CCG ACA AAC CCT GAT GAT GTG ATT CCT CCC GCT ATC AAT GAT 1344
Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp
435 440 445

ATT CCA TCG CCG CCA GCC CGC GAA ACA CTG TCA CTG ACG CCG GTC AGT 1392
Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser
450 455 460

TAT CAA TTG ATG ACC AAT CCG GCA CCG ACA GAA GAT GAT ATT ACC AAC 1440
Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn
465 470 475 480

CAT TAT GGT TTT AAC GGC GCT AGC TTA CGG GCT TCT CCA TTG TCA ACC 1488
His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr
485 490 495

AGC GAG TTG ACC AGC AAA CTG AAT TCT ATC GAT ACT TTC TGT GAG AAG 1536
Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys
500 505 510






ACC CGG TTA AGC TTC AAT CAG TTA ATG GAT TTG ACC GCT CAG CAA TCT 1584
Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser
515 520 525

TAC AGT CAA AGC AGC ATT GAT GCG AAA GCA GCC AGC CGC TAT GTT CGT 1632
Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg
530 535 540

TTT GGG GAA ACC ACC CCA ACC CGC GTC AAT GTC TAC GGT GCC GCT TAT 1680
Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr
545 550 555 560

CTG AAC AGC ACA CTG GCA GAC GCG GCT GAT GGT CAA TAT CTG TGG ATT 1728
Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile
565 570 575

CAG ACT GAT GGC AAG AGC CTA AAT TTC ACT GAC GAT ACG GTA GTC GCC 1776
Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala
580 585 590

TTA GCC GGT CGC GCT GAA AAG CTG GTA CGT TTA TCA TCC CAG ACC GGG 1824
Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly
595 600 605

CTA TCA TTT GAA GAA TTG GAC TGG CTG ATT GCC AAT GCC AGT CGT AGT 1872
Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser
610 615 620

GTG CCG GAC CAC CAC GAC AAA ATT GTG CTG GAT AAG CCG GTC CTT GAA 1920
Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu
625 630 635 640

GCA CTG GCA GAG TAT GTC AGC CTA AAA CAG CGC TAT GGG CTT GAT GCC 1968
Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala
645 650 655

AAT ACC TTT GCG ACC TTC ATT AGT GCA GTA AAT CCT TAT ACG CCA GAT 2016
Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp
660 665 670

CAG ACA CCC AGT TTC TAT GAA ACC GCT TTC CGC TCT GCC GAC GGT AAT 2064
Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn
675 680 685

CAT GTC ATT GCG CTA GGT ACA GAG GTG AAA TAT GCA GAA AAT GAG CAG 2112
His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln
690 695 700

GAT GAG TTA GCC GCC ATA TGC TGC AAA GCA TTG GGT GTC ACC AGT GAT 2160
Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp
705 710 715 720

GAA CTG CTC CGT ATT GGT CGC TAT TGC TTC GGT AAT GCA GGC AGT TTT 2208
Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe
725 730 735

ACC TTG GAT GAA TAT ACC GCC AGT CAG TTG TAT CGC TTC GGC GCC ATT 2256
Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile
740 745 750

CCC CGT TTG TTT GGG [illegible] ACA TTT GCC CAA GCC GAA ATT [illegible] TGG CGT 2304
Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg
755 760 765

CTG ATG GAA GGC GGA AAA GAT ATC TTA TTG CAA CAG TTA GGT CAG GCA 2352
Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala
770 775 780

AAA TCC CTG CAA CCA CTG GCT ATT TTA CGC CGT ACC GAG CAG GTG CTG 2400
Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu
785 790 795 800

GAT TGG ATG TCG TCC GTA AAT CTA AGT CTG ACT TAT CTG CAA GGG ATG 2448
Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met
805 810 815

GTA AGT ACG CAA TGG AGC GGT ACC GCC ACC GCT GAG ATG TTC AAT TTC 2496
Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe
820 825 830

TTG GAA AAC GTT TGT GAC AGC GTG AAT AGT CAA GCT GCC ACT AAA GAA 2544
Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu
835 840 845

ACA ATG GAT TCG GCG TTA CAG CAG AAA GTG CTG CGG GCG CTA AGC GCC 2592
Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala
850 855 860

GGT TTC GGC ATT AAG AGC AAT GTG ATG GGT ATC GTC ACC TTC TGG CTG 2640
Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu
865 870 875 880

GAG AAA ATC ACA ATC GGT AGT GAT AAT CCT TTT ACA TTG GCA AAC TAC 2688
Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr
885 890 895

TGG CAT GAT ATT CAA ACC CTG TTT AGC CAT GAC AAT GCC ACG TTA GAG 2736
Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu
900 905 910

TCC TTA CAA ACC GAC ACT TCT CTG GTA ATT GCT ACT CAG CAA CTT AGC 2784
Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr Gln Gln Leu Ser
915 920 925

CAG CTA GTG TTA ATT GTG AAA TGG CTG AGC CTG ACC GAG CAG GAT CTG 2832
Gln Leu Val Leu Ile Val Lys Trp Leu Ser Leu Thr Glu Gln Asp Leu
930 935 940

CAA TTA CTG ACA ACC TAT CCC GAA CGT TTA ATC AAC GGC ATC ACG AAT 2880
Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn Gly Ile Thr Asn
945 950 955 960

GTT CCT GTA CCC AAT CCG GAG CTA TTA CTC ACG CTA TCA CGT TTT AAG 2928
Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys
965 970 975

CAG TGG GAA ACT CAA GTC ACC GTT TCC CGT GAT GAA GCG ATG CGC TGT 2976
Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys
980 985 990

TTC GAT CAA TTA AAT GCC AAT GAT ATG ACG ACT GAA AAT GCA GGT TCA 3024
Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser
995 1000 1005

CTG ATC GCC ACA TTG TAT GAG ATG GAT AAA GGT ACG GGA GCG CAA GTT 3072
Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gln Val
1010 1015 1020

AAT ACC TTG CTA TTA GGT GAA AAT AAC TGG CCG AAA AGT TTT ACC TCT 3120
Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser
1025 1030 1035 1040

CTC TGG CAA CTT CTG ACC TGG TTA CGC GTC GGG CAA AGA CTG AAT GTC 3168
Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln Arg Leu Asn Val
1045 1050 1055

GGT AGT ACC ACT CTG GGC AAT CTG TTG TCC ATG ATG CAA GCA GAC CCT 3216
Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gln Ala Asp Pro
1060 1065 1070

GCT GCC GAG AGT AGC GCT TTA TTG GCA TCA GTA GCC CAA AAC TTA AGT 3264
Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Leu Ser
1075 1080 1085

GCC GCA ATC AGC AAT CGT CAG TAA 3288
Ala Ala Ile Ser Asn Arg Gln ***
1090 1095






(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1095 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 (TcaA protein):
Features From To Description
254 267 SEQ ID NO: 15
254 492 TcaAii peptide






Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser
1 5 10 15

Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile Ala
20 25 30

Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val
35 40 45

Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn
50 55 60

Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile
65 70 75 80

Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln
85 90 95

Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu
100 105 110

Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu
115 120 125

Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp
130 135 140

Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp
145 150 155 160

Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
165 170 175

Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile
180 185 190

Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
195 200 205

His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu
210 215 220

Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser
225 230 235 240

Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp
245 250 255

Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala
260 265 270

Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser
275 280 285






Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln
290 295 300

Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala
305 310 315 320

Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp
325 330 335

Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn
340 345 350

Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser
355 360 365

Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser
370 375 380

Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr
385 390 395 400

Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn
405 410 415

Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser
420 425 430

Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp
435 440 445

Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser
450 455 460

Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn
465 470 475 480

His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr
485 490 495

Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys
500 505 510

Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser
515 520 525

Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg
530 535 540

Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr
545 550 555 560

Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile
565 570 575

Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala
580 585 590



Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly
595 600 605

Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser
610 615 620

Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu
625 630 635 640

Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala
645 650 655

Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp
660 665 670

Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn
675 680 685

His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln
690 695 700

Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp
705 710 715 720

Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe
725 730 735

Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile
740 745 750

Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg
755 760 765

Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala
770 775 780

Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu
785 790 795 800

Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met
805 810 815

Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe
820 825 830

Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu
835 840 845

Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala
850 855 860

Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu
865 870 875 880

Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr
885 890 895

Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu
900 905 910

Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr Gln Gln Leu Ser
915 920 925

Gln Leu Val Leu Ile Val Lys Trp Leu Ser Leu Thr Glu Gln Asp Leu
930 935 940

Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn Gly Ile Thr Asn
945 950 955 960

Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys
965 970 975



Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys
980 985 990

Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser
995 1000 1005

Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gln Val
1010 1015 1020

Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser
1025 1030 1035 1040

Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln Arg Leu Asn Val
1045 1050 1055

Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gln Ala Asp Pro
1060 1065 1070

Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Leu Ser
1075 1080 1085

Ala Ala Ile Ser Asn Arg Gln ***
1090 1095



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 603 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35 (TcaAiii protein):

Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr
1 5 10 15

Phe Cys Glu Lys Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr
20 25 30

Ala Gln Gln Ser Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser
35 40 45

Arg Tyr Val Arg Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr
50 55 60

Gly Ala Ala Tyr Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln
65 70 75 80

Tyr Leu Trp Ile Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp
85 90 95

Thr Val Val Ala Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser
100 105 110

Ser Gln Thr Gly Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn
115 120 125

Ala Ser Arg Ser Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys
130 135 140

Pro Val Leu Glu Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr
145 150 155 160

Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro
165 170 175

Tyr Thr Pro Asp Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser
180 185 190

Ala Asp Gly Asn His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala
195 200 205

Glu Asn Glu Gln Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly
210 215 220

Val Thr Ser Asp Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn
225 230 235 240

Ala Gly Arg Phe Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg
245 250 255

Phe Gly Ala Ile Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu
260 265 270

Ile Leu Trp Arg Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln
275 280 285

Xxx Gly Gln Ala Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr
290 295 300

Glu Gln Val Leu Asp Trp Met Ser Pro Val Asn Leu Ser Leu Thr Tyr
305 310 315 320

Leu Gln Gly Met Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu
325 330 335

Met Phe Asn Phe Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala
340 345 350

Xxx Thr Lys Glu Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg
355 360 365

Ala Leu Ser Ala Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val
370 375 380

Thr Phe Trp Leu Glu Lys Ile Thr Ile Gly Arg Asp Asn Pro Phe Thr
385 390 395 400

Leu Ala Asn Tyr Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn
405 410 415

Ala Thr Leu Glu Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr
420 425 430

Gln Gln Leu Ser Gln Leu Val Leu Ile Val Lys Trp Val Ser Leu Thr
435 440 445

Glu Gln Asp Leu Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn
450 455 460

Gly Ile Thr Asn Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu
465 470 475 480

Ser Arg Phe Lys Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu
485 490 495

Ala Met Arg Cys Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu
500 505 510

Asn Ala Gly Ser Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr
515 520 525

Gly Ala Gln Val Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys
530 535 540

Ser Phe Thr Ser Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln
545 550 555 560

Arg Leu Asn Val Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met
565 570 575

Gln Ala Asp Pro Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala
580 585 590

Gln Asn Leu Ser Ala Ala Ile Ser Asn Arg Gln *
595 600



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2557 base pairs

(B) TYPE: nucleic acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 (tcdA internal fragment):




[illegible] [illegible] ATTGATGATG TCTCGCTCTT CCGCCTGCTT AAAATTACCG 60
ACCATGATAA TAAAGATGGA AAAATTAAAA ATAACCTAAA GAATCTTTCC AATTTATATA 120
[illegible] [illegible] ATTCATCAAT TAACCATTGA TGAACTGGAT TTATTACTGA 180
[illegible] [illegible] ACTAATTTAT CCGCTATCAG TGATAAGCAA TTGGCTACCC 240
[illegible] [illegible] ATTACCAGCT GGCTACATAC ACAGAAGTGG AGTGTATTCC 300
[illegible] [illegible] ACCAGCTATA ACAAAACGCT AACGCCTGAA ATTAAGAATT 360
[illegible] [illegible] GGTTTACAAG GTTTTGATAA AGACAAAGCA GATTTGCTAC 420
[illegible] GCCCTATATT GCGGCCACCT TGCAATTATC ATCGGAAAAT GTCGCCCACT 480
CGGTACTCCT TTGGGCAGAT AAGTTACAGC CCGGCGACGG CGCAATGACA GCAGAGGGAN 540
TCTGGGACTG GTTGAATACT AAGTATACGC CGGGTTCATC GGAAGCCGTA GAAACGCAGG 600
[illegible] TCAGTATTGT CAGGCTCTGG CACAATTGGA AATGGTTTAC CATTCCACCG 660
[illegible] AAACGCCTTG CGTCTATTTG TGACAAAACC AGAGATGTTT GGCGCTGCAA 720
CTGGAGCAGC GCCCGCGCAT GATGCCCTTT CACTGATTAT GCTGACACGT TTTGCGGATT 780
GGGTGAACGC ACTAGGCGAA AAAGCGTCCT CGGTGCTAGC GGCATTTGAA GCTAACTCGT 840
TAACGGCAGA ACAACTGGCT GATGCCATGA ATCTTGATGC TAATTTGCTG TTGCAAGCCA 900
GTATTCAAGC ACAAAATCAT CAACATCTTC CCCCAGTAAC TCCAGAAAAT GCGTTCTCCT 960
GTTGGACATC TATCAATACT ATCCTGCAAT GGGTTAATGT CGCACAACAA TTGAAATGTC 1020
GCCCCACAGG GCGTTTCCGC TTTGGTCGGG CTGGATTATA TTCAATCAAT GAAAGAGACA 1080
CCGACCTATG CCCAGTGGGA AAACGCGGCA GGCGTATTAA CCGCCGGGTT GAATTCAACA 1140
ACAGGCTAAT ACATTACAAC GCTTTTCTGG ATGAATCTCG CAGTGCCGCA TTAAGCACCT 1200
ACTATATCCG TCAAGTCGCC AAGGCAGCGG CGGCTATTAA AAGCCGTGAT GACTTGTATC 1260
AATACTTACT GATTGATAAT CAGGTTTCTG CGGCAATAAA AACCACCCGG ATCGCCGAAG 1320
CCATTGCCAG TATTCAACTG TACGTCAACC GGGCATTGGA AAATGTGGAA GAAAATGCCA 1380
ATTCGGGGGT TATCAGCCGC CAATTCTTTA TCGACTGGGA CAAATACAAT AAACGCTACA 1440
GCACTTGGGC GGGTGTTTCT CAATTAGTTT ACTACCCGGA AAACTATATT GATCCGACCA 1500
TGCGTATCGG ACAAACCAAA ATGATGGACG CATTACTGCA ATCCGTCAGC CAAAGCCAAT 1560
TAAACGCCGA TACCGTCGAA GATGCCTTTA TGTCTTATCT GACATCGTTT GAACAAGTGG 1620
CTAATCTTAA AGTTATTAGC GCATATCACG ATAATATTAA TAACGATCAA GGGCTGACCT 1680
ATTTTATCGG ACTCAGTGAA ACTGATGCCG GTGAATATTA TTGGCGCAGT GTCGATCACA 1740
GTAAATTCAA CGACGGTAAA TTCGCGGCTA ATGCCTGGAG TGAATGGCAT AAAATTGATT 1800
GTCCAATTAA [illegible] AGCACTATCC GTCCAGTGAT [illegible] CGCCTGTATC 1860
TGCTCTGGTT GGAACAAAAG GAGATCACCA AACAGACAGG AAATAGTAAA GATGGCTATC 1920
AAACTGAAAC GGATTATCGT TATGAACTAA AATTGGCGCA TATCCGCTAT GATGGCACTT 1980
GGAATACGCC AATCACCTTT GATGTCAATA AAAAAATATC CGAGCTAAAA CTGGAAAAAA 2040
ATAGAGCGCC CGGACTCTAT TGTGCCGGTT ATCAAGGTGA AGATACGTTG CTGGTGATGT 2100
TTTATAACCA ACAAGACACA CTAGATAGTT ATAAAAACGC TTCAATGCAA GGACTATATA 2160
TCTTTGCTGA TATGGCATCC AAAGATATGA CCCCAGAACA GAGCAATGTT TATCGGGATA 2220
ATAGCTATCA ACAATTTGAT ACCAATAATG TCAGAAGAGT GAATAACCGC TATGCAGAGG 2280
ATTATGAGAT TCCTTCTTCG GTAAGTAGCC GTAAAGACTA TGGTTGGGGA GATTATTACC 2340
TCAGCATGGT ATATAACGGA GATATTCCAA CTATCAATTA [illegible] [illegible] 2400
TAAAAATTTA TATTTCACCA AAATTAAGAA TTATTCATAA TGGATATGAA GGACAGAAGC 2460
GCAATCAATG CAATTTGATG AATAAATATG GCAAACTAGG TGATAAATTT ATTGTGTATA 2520
CCAGCCTGGG CGTTAATCCG AATAATAAGC CGAATTC 2557


(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 845 amino acids
(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein (partial)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 (TcdA internal peptide):

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr
1 5 10 15

Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu
20 25 30

Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu Thr
35 40 45

Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys Thr
50 55 60

Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg Lys
65 70 75 80

Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val Phe
85 90 95

Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro
100 105 110

Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe
115 120 125

Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile Ala
130 135 140

Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu
145 150 155 160

Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu Gly
165 170 175

Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala
180 185 190

Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala Gln
195 200 205

Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe Arg
210 215 220

Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala
225 230 235 240

Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp
245 250 255

Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe
260 265 270

Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn Leu
275 280 285

Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His Gln
290 295 300

His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser
305 310 315 320

Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Lys Cys
325 330 335

Arg Pro Thr Gly Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser Ile
340 345 350

Asn Glu Arg Asp Thr Asp Leu Cys Pro Val Gly Lys Arg Gly Arg Arg
355 360 365



Ile Asn Arg Arg Val Glu Phe Asn Asn Arg Leu Ile His Tyr Asn Ala
370 375 380

Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg
385 390 395 400

Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr
405 410 415

Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr
420 425 430

Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala
435 440 445

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln
450 455 460

Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
465 470 475 480

Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr
485 490 495

Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val
500 505 510

Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser
515 520 525

Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala
530 535 540

Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly
545 550 555 560

Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
565 570 575

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
580 585 590



His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
595 600 605

Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
610 615 620

Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
625 630 635 640

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
645 650 655

Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
660 665 670

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
675 680 685

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
690 695 700

Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
705 710 715 720

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
725 730 735

Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
740 745 750

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
755 760 765

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
770 775 780

Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
785 790 795 800

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
805 810 815

Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
820 825 830

Phe Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 (TcdAii-pk71 internal peptide):

Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 (TcdAii-pK44 internal peptide):

Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala
1 5 10 15

Ile Ser Pro Ala Lys
20






(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40 (TcbAiii N-terminus):

Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln
1 5 10



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41 (TcdAiii N-terminus):

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 (TcdA-pk57 internal peptide):

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr
1 5 10 15

Ala Gly Leu Glu



(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 (TcdAiii-pK20 internal peptide):

Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His Arg
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

Asp Val Xaa Gly Ser [illegible] Lys Ala Asn Glu Lys Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7551 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46 (tcdA):

ATG AAC GAG TCT GTA AAA GAG ATA CCT GAT GTA TTA AAA AGC CAG TGT 48
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

GGT TTT AAT TGT CTG ACA GAT ATT AGC CAC AGC TCT TTT AAT GAA TTT 96
Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

CGC CAG CAA GTA TCT GAG CAC CTC TCC TGG TCC GAA ACA CAC GAC TTA 144
Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

TAT CAT GAT GCA CAA CAG GCA CAA AAG GAT AAT CGC CTG TAT GAA GCG 192
Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

CGT ATT CTC AAA CGC GCC AAT CCC CAA TTA CAA AAT GCG GTG CAT CTT 240
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80

GCC ATT CTC GCT CCC AAT GCT GAA CTG ATA GGC TAT AAC AAT CAA TTT 288
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

AGC GGT AGA GCC AGT CAA TAT GTT GCG CCG GGT ACC GTT TCT TCC ATG 336
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

TTC TCC CCC GCC GCT TAT TTG ACT GAA CTT TAT CGT GAA GCA CGC AAT 384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

TTA CAC GCA AGT GAC TCC GTT TAT TAT CTG GAT ACC CGC CGC CCA GAT 432
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140

CTC AAA TCA ATG GCG CTC AGT CAG CAA AAT ATG GAT ATA GAA TTA TCC 480
Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160

ACA CTC TCT TTG TCC AAT GAG CTG TTA TTG GAA AGC ATT AAA ACT GAA 528
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175

TCT AAA CTG GAA AAC TAT ACT AAA GTG ATG GAA ATG CTC TCC ACT TTT 576
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190

CGT CCT TCC GGC GCA ACG CCT TAT CAT GAT GCT TAT GAA AAT GTG CGT 624
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205

GAA GTT ATC CAG CTA CAA GAT CCT GGA CTT GAG CAA CTC AAT GCA TCA 672
Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220

CCG GCA ATT GCC GGG TTG ATG CAT CAA GCC TCC CTA TTG GGT ATT AAC 720
Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

GCT TCA ATC TCG CCT GAG CTA TTT AAT ATT CTG ACG GAG GAG ATT ACC 768
Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255

GAA GGT AAT GCT GAG GAA CTT TAT AAG AAA AAT TTT GGT AAT ATC GAA 816
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

CCG GCC TCA TTG GCT ATG CCG GAA TAC CTT AAA CGT TAT TAT AAT TTA 864
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

AGC GAT GAA GAA CTT AGT CAG TTT ATT GGT AAA GCC AGC AAT TTT GGT 912
Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

CAA CAG GAA TAT AGT AAT AAC CAA CTT ATT ACT CCG GTA GTC AAC AGC 960
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

AGT GAT GGC ACG GTT AAG GTA TAT CGG ATC ACC CGC GAA TAT ACA ACC 1008
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

AAT GCT TAT CAA ATG GAT GTG GAG CTA TTT CCC TTC GGT GGT GAG AAT 1056
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

TAT CGG TTA GAT TAT AAA TTC AAA AAT TTT TAT AAT GCC TCT TAT TTA 1104
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

TCC ATC AAG TTA AAT GAT AAA AGA GAA CTT GTT CGA ACT GAA GGC GCT 1152
Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380

CCT CAA GTC AAT ATA GAA TAC TCC GCA AAT ATC ACA TTA AAT ACC GCT 1200
Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

GAT ATC AGT CAA CCT TTT GAA ATT GGC CTG ACA CGA GTA CTT CCT TCC 1248
Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

GGT TCT TGG GCA TAT GCC GCC GCA AAA TTT ACC GTT GAA GAG TAT AAC 1296
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

CAA TAC TCT TTT CTG CTA AAA CTT AAC AAG GCT ATT CGT CTA TCA CGT 1344
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445

GCG ACA GAA TTG TCA CCC ACG ATT CTG GAA GGC ATT GTG CGC AGT GTT 1392
Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460

AAT CTA CAA CTG GAT ATC AAC ACA GAC GTA TTA GGT AAA GTT TTT CTG 1440
Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480

ACT AAA TAT TAT ATG CAG CGT TAT GCT ATT CAT GCT GAA ACT GCC CTG 1488
Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

ATA CTA TGC AAC GCG CCT ATT TCA CAA CGT TCA TAT GAT AAT CAA CCT 1536
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

AGC CAA TTT GAT CGC CTG TTT AAT ACG CCA TTA CTG AAC GGA CAA TAT 1584
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

TTT TCT ACC GGC GAT GAG GAG ATT GAT TTA AAT TCA GGT AGC ACC GGC 1632
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540

GAT TGG CGA AAA ACC ATA CTT AAG CGT GCA TTT AAT ATT GAT GAT GTC 1680
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560

TCG CTC TTC CGC CTG CTT AAA ATT ACC GAC CAT GAT AAT AAA GAT GGA 1728
Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575

AAA ATT AAA AAT AAC CTA AAG AAT CTT TCC AAT TTA TAT ATT GGA AAA 1776
Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

TTA CTG GCA GAT ATT CAT CAA TTA ACC ATT GAT GAA CTG GAT TTA TTA 1824
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605

CTG ATT GCC GTA GGT GAA GGA AAA ACT AAT TTA TCC GCT ATC AGT GAT 1872
Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620

AAG CAA TTG GCT ACC CTG ATC AGA AAA CTC AAT ACT ATT ACC AGC TGG 1920
Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr Ser Trp
625 630 635 640

CTA CAT ACA CAG AAG TGG AGT GTA TTC CAG CTA TTT ATC ATG ACC TCC 1968
Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser
645 650 655

ACC AGC TAT AAC AAA ACG CTA ACG CCT GAA ATT AAG AAT TTG CTG GAT 2016
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp
660 665 670

ACC GTC TAC CAC GGT TTA CAA GGT TTT GAT AAA GAC AAA GCA GAT TTG 2064
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu
675 680 685

CTA CAT GTC ATG GCG CCC TAT ATT GCG GCC ACC TTG CAA TTA TCA TCG 2112
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser
690 695 700

GAA AAT GTC GCC CAC TCG GTA CTC CTT TGG GCA GAT AAG TTA CAG CCC 2160
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro
705 710 715 720

GGC GAC GGC GCA ATG ACA GCA GAA AAA TTC TGG GAC TGG TTG AAT ACT 2208
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr
725 730 735

AAG TAT ACG CCG GGT TCA TCG GAA GCC GTA GAA ACG CAG GAA CAT ATC 2256
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile
740 745 750

GTT CAG TAT TGT CAG GCT CTG GCA CAA TTG GAA ATG GTT TAC CAT TCC 2304
Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser
755 760 765

ACC GGC ATC AAC GAA AAC GCC TTC CGT CTA TTT GTG ACA AAA CCA GAG 2352
Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780

[DNA line illegible in source] 2400
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser
785 790 795 800

CTG ATT ATG CTG ACA CGT TTT GCG GAT TGG GTG AAC GCA CTA GGC GAA 2448
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu
805 810 815

AAA GCG TCC TCG GTG CTA GCG GCA TTT GAA GCT AAC TCG TTA ACG GCA 2496
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala
820 825 830

GAA CAA CTG GCT GAT GCC ATG AAT CTT GAT GCT AAT TTG CTG TTG CAA 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845

GCC AGT ATT CAA GCA CAA AAT CAT CAA CAT CTT CCC CCA GTA ACT CCA 2592
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro
850 855 860

GAA AAT GCG TTC TCC TGT TGG ACA TCT ATC AAT ACT ATC CTG CAA TGG 2640
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880

GTT AAT GTC GCA CAA CAA TTG AAT GTC GCC CCA CAC GGC GTT TCC GCT 2688
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895

TTG GTC GGG CTG GAT TAT ATT CAA TCA ATG AAA GAG ACA CCG ACC TAT 2736
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr
900 905 910

GCC CAG TGG GAA AAC GCG GCA GGC GTA TTA ACC GCC GGG TTG AAT TCA 2784
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925

CAA CAG GCT AAT ACA TTA CAC GCT TTT CTG GAT GAA TCT CGC AGT CCC 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940

GCA TTA AGC ACC TAC TAT ATC CGT CAA GTC GCC AAG GCA GCG GCG GCT 2880
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
945 950 955 960

ATT AAA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTG ATT GAT AAT CAG 2928
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
965 970 975

GTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC GAA GCC ATT GCC AGT 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990

ATT CAA CTG TAC GTC AAC CGG GCA TTG GAA AAT GTG GAA GAA AAT GCC 3024
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005

AAT TCG GGG GTT ATC AGC CGC CAA TTC TTT ATC GAC TGG GAC AAA TAC 3072
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr
1010 1015 1020

AAT AAA CGC TAC AGC ACT TGG GCG GGT GTT TCT CAA TTA GTT TAC TAC 3120
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
1025 1030 1035 1040

CCG GAA AAC TAT ATT GAT CCG ACC ATG CGT ATC GGA CAA ACC AAA ATG 3168
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
1045 1050 1055

ATG GAC GCA TTA CTG CAA TCC GTC AGC CAA AGC CAA TTA AAC GCC GAT 3216
Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
1060 1065 1070

ACC GTC GAA GAT GCC TTT ATG TCT TAT CTG ACA TCG TTT GAA CAA GTG 3264
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

GCT AAT CTT AAA GTT ATT AGC GCA TAT CAC GAT AAT ATT AAT AAC GAT 3312
Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp
1090 1095 1100

CAA GGG CTG ACC TAT TTT ATC GGA CTC AGT GAA ACT GAT GCC GGT GAA 3360
Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu
1105 1110 1115 1120

TAT TAT TGG CGC AGT GTC GAT CAC AGT AAA TTC AAC GAC GGT AAA TTC 3408
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
1125 1130 1135

GCG GCT AAT GCC TGG AGT GAA TGG CAT AAA ATT GAT TGT CCA ATT AAC 3456
Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn
1140 1145 1150

CCT TAT AAA AGC ACT ATC CGT CCA GTG ATA TAT AAA TCC CGC CTG TAT 3504
Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr
1155 1160 1165

CTG CTC TGG TTG GAA CAA AAG GAG ATC ACC AAA CAG ACA GGA AAT AGT 3552
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser
1170 1175 1180

AAA GAT GGC TAT CAA ACT GAA ACG GAT TAT CGT TAT GAA CTA AAA TTG 3600
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu
1185 1190 1195 1200

GCG CAT ATC CGC TAT GAT GGC ACT TGG AAT ACG CCA ATC ACC TTT GAT 3648
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp
1205 1210 1215

GTC AAT AAA AAA ATA TCC GAG CTA AAA CTG GAA AAA AAT AGA GCG CCC 3696
Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro
1220 1225 1230

GGA CTC TAT TGT GCC GGT TAT CAA GGT GAA GAT ACG TTG CTG GTG ATG 3744
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245

TTT TAT AAC CAA CAA GAC ACA CTA GAT AGT TAT AAA AAC GCT TCA ATG 3792
Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met
1250 1255 1260

CAA GGA CTA TAT ATC TTT GCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840
Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro
1265 1270 1275 1280

GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888
Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr
1285 1290 1295

AAT AAT GTC AGA AGA GTG AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile
1300 1305 1310

CCT TCC TCG GTA AGT AGC CGT AAA GAC TAT GGT TGG GGA GAT TAT TAC 3984
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr
1315 1320 1325

CTC AGC ATG GTA TAT AAC GGA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala
1330 1335 1340

GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080
Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile
1345 1350 1355 1360

CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128
His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn
1365 1370 1375

AAA TAT GGC AAA CTA GGT GAT AAA TTT ATT GTT TAT ACT AGC TTG GGG 4176
Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly
1380 1385 1390

GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr
1395 1400 1405

CAA TAT AGC GGA AAC ACC AGT GGA CTC AAT CAA GGG AGA CTA CTA TTC 4272
Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe
1410 1415 1420

CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TGG ATT CCT GGA 4320
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly
1425 1430 1435 1440

GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4368
Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr
1445 1450 1455

GCT ACA GAC TCT CTG AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe
1460 1465 1470

ATG ACT GAC AGT AAA GGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485

ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512
Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala
1490 1495 1500

GGT GGC AAG GAG CAA ACT TTT ACC GCA GAT AAA GAT GTC TCC ATT CAG 4560
Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln
1505 1510 1515 1520

CCA TCA CCT AGC TTT GAT GAA ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu
1525 1530 1535

ATA GAC GGT TCT GGT CTG AAT TTT ATT AAC AAC TCA GCC AGT ATT GAT 4656
Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp
1540 1545 1550

GTT ACT TTT ACC GCA TTT GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu
1555 1560 1565

AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752
Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu
1570 1575 1580

ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800
Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser
1585 1590 1595 1600

TAT CGT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg
1605 1610 1615

GCC ACC ACC GGA ATC GAT ACA ATT CTG AGT ATG GAA ACT CAG AAT ATT 4896
Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile
1620 1625 1630

CAG GAA CCG CAG TTA GGC AAA GGT TTC TAT GCT ACG TTC GTG ATA CCT 4944
Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro
1635 1640 1645

CCC TAT AAC CTA TCA ACT CAT GGT GAT GAA CGT TGG TTT AAG CTT TAT 4992
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr
1650 1655 1660

ATC AAA CAT GTT GTT GAT AAT AAT TCA CAT ATT ATC TAT TCA GGC CAG 5040
Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln
1665 1670 1675 1680

CTA ACA GAT ACA AAT ATA AAC ATC ACA TTA TTT ATT CCT CTT GAT GAT 5088
Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp
1685 1690 1695

GTC CCA TTG AAT CAA GAT TAT CAC GCC AAG GTT TAT ATG ACC TTC AAG 5136
Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys
1700 1705 1710

AAA TCA CCA TCA GAT GGT ACC TGG TGG GGC CCT CAC TTT GTT AGA GAT 5184
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725

GAT AAA GGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe
1730 1735 1740

GAG AGC GTC AAT GTC CTG AAT AAT ATT AGT AGC GAA CCA ATG GAT TTC 5280
Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe
1745 1750 1755 1760

AGC GGC GCT AAC AGC CTC TAT TTC TGG GAA CTG TTC TAC TAT ACC CCG 5328
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro
1765 1770 1775

ATG CTG GTT GCT CAA CGT TTG CTG CAT GAA CAG AAC TTC GAT GAA GCC 5376
Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala
1780 1785 1790

AAC CGT TGG CTG AAA TAT GTC TGG AGT CCA TCC GGT TAT ATT GTC CAC 5424
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His
1795 1800 1805

GGC CAG ATT CAG AAC TAC CAG TGG AAC GTC CGC CCG TTA CTG GAA GAC 5472
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp
1810 1815 1820

ACC AGT TGG AAC AGT GAT CCT TTG GAT TCC GTC GAT CCT GAC GCG GTA 5520
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val
1825 1830 1835 1840

GCA CAG CAC GAT CCA ATG CAC TAC AAA GTT TCA ACT TTT ATG CGT ACC 5568
Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr
1845 1850 1855

TTG GAT CTA TTG ATA GCA CGC GGC GAC CAT GCT TAT CGC CAA CTG GAA 5616
Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1860 1865 1870

CGA GAT ACA CTC AAC GAA GCG AAG ATG TGG TAT ATG CAA GCG CTG CAT 5664
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His
1875 1880 1885

CTA TTA GGT GAC AAA CCT TAT CTA CCG CTG AGT ACG ACA TGG AGT GAT 5712
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asn
1890 1895 1900

CCA CGA CTA GAC AGA GCC GCG GAT ATC ACT ACC CAA AAT GCT CAC GAC 5760
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920

AGC GCA ATA [two codons illegible] CTG CGG CAG AAT ATA CCT ACA [codon illegible] GCA CCT TTA 5808
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935

TCA TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC 5856
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950

AAT GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC 5904
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965

AAT CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA 5952
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980

ATC TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT 6000
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000

GCC ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG 6048
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015

TGG CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG 6096
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030

CTC ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC 6144
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045

GCG GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTC ATA 6192
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060

TTG ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC 6240
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080

GAG AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT 6288
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095

GAT AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC 6336
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110

CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125

CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140

TTC GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175

GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190

GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205

CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220

ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240

CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270

TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser
2275 2280 2285

GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300

CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320

CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335

CTG GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350

CTG GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC 7104
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365

GGC AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA 7152
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380

ACC TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA 7200
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400

GAT TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415

GTC ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA 7296
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430

TTG TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG 7344
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445

GCA GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC 7392
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460

AAC GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC 7440
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480

ACG CTG ACA CTG AGC TTC CCA AAT GCA [codon illegible] ATG CCG GAG AAA GGT AAA 7488
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495

CAA GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC 7536
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510

TAC ACC ATT AAA TAA 7551
Tyr Thr Ile Lys ***
2516

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):












Features    From          To            Description

Peptide     1             2516          TcdA proteins
Peptide     89            1937          TcdAii peptide
Fragment    89            100           TcdAii N-terminus (SEQ ID NO: 13)
Fragment    284           299           (SEQ ID NO: 38)
Fragment    554           563           (SEQ ID NO: 17)
Fragment    1080          1092          (SEQ ID NO: 23; 12/13)
Fragment    [illegible]   [illegible]   (SEQ ID NO: 18)
Fragment    1478          1497          (SEQ ID NO: 39)
Fragment    1620          164[?]        (SEQ ID NO: 21; 19/23)
Fragment    1938          [illegible]   (SEQ ID NO: 41)
Peptide     [illegible]   [illegible]   TcdAiii peptide
Fragment    2327          2345          (SEQ ID NO: [illegible])
Fragment    2398          2408          (SEQ ID NO: 43)

Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80

Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140

Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160

Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175

Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190

Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205

Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220

Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255

Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380

Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445

Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460

Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480

Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540

Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560

Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575

Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605

Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620

Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr Ser Trp
625 630 635 640

Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser
645 650 655

Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp
660 665 670

Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu
675 680 685

Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser
690 695 700

Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro
705 710 715 720

Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr
725 730 735

Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile
740 745 750

Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser
755 760 765

Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780

Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser
785 790 795 800

Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu
805 810 815

Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala
820 825 830

Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845

Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro
850 855 860

Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880

Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895

Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr
900 905 910

Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925







Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940

Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
945 950 955 960

Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
965 970 975

Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990

Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005

Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr
1010 1015 1020

Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
1025 1030 1035 1040

Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
1045 1050 1055

Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
1060 1065 1070

Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp
1090 1095 1100

Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu
1105 1110 1115 1120



Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
1125 1130 1135

Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn
1140 1145 1150

Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr
1155 1160 1165

Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser
1170 1175 1180

Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu
1185 1190 1195 1200

Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp
1205 1210 1215

Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro
1220 1225 1230

Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245

Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met
1250 1255 1260

Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro
1265 1270 1275 1280

Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr
1285 1290 1295

Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile
1300 1305 1310

Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr
1315 1320 1325

Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala
1330 1335 1340

Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile
1345 1350 1355 1360

His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn
1365 1370 1375

Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly
1380 1385 1390

Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr
1395 1400 1405

Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe
1410 1415 1420

His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly
1425 1430 1435 1440

Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr
1445 1450 1455

Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe
1460 1465 1470

Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485

Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala
1490 1495 1500

Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln
1505 1510 1515 1520

Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu
1525 1530 1535

Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp
1540 1545 1550

Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu
1555 1560 1565

Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu
1570 1575 1580



Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser
1585 1590 1595 1600

Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg
1605 1610 1615

Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile
1620 1625 1630

Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro
1635 1640 1645

Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr
1650 1655 1660

Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln
1665 1670 1675 1680

Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp
1685 1690 1695

Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys
1700 1705 1710

Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725

Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe
1730 1735 1740






Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe
1745 1750 1755 1760

Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro
1765 1770 1775

Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala
1780 1785 1790

Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His
1795 1800 1805

Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp
1810 1815 1820

Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val
1825 1830 1835 1840

Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr
1845 1850 1855

Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1860 1865 1870

Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His
1875 1880 1885

Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp
1890 1895 1900

Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920

Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935



Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950

Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965

Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980

Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000

Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015

Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030

Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045

Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060

Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080






Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095

Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110

Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125

Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140

Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175

Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190

Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205

Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220

Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240

Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270

Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser
2275 2280 2285

Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300

Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320

His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335

Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350

Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365

Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380

Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400

Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415

Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430

Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445

Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460

Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480

Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495

Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510

Tyr Thr Ile Lys
2516






(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5547 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 (tcdAii coding region):

CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC AGT CAA TAT GTT  48
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10 15

GCG CCG GGT ACC GTT TCT TCC ATG TTC TCC CCC GCC GCT TAT TTG ACT  96
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
20 25 30

GAA CTT TAT CGT GAA GCA CGC AAT TTA CAC GCA AGT GAC TCC GTT TAT  144
Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr
35 40 45

TAT CTG GAT ACC CGC CGC CCA GAT CTC AAA TCA ATG GCG CTC AGT CAG  192
Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln
50 55 60

CAA AAT ATG GAT ATA GAA TTA TCC ACA CTC TCT TTG TCC AAT GAG CTG  240
Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu
65 70 75 80

TTA TTG GAA AGC ATT AAA ACT GAA TCT AAA CTG GAA AAC TAT ACT AAA  288
Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys
85 90 95

GTG ATG GAA ATG CTC TCC ACT TTC CGT CCT TCC GGC GCA ACG CCT TAT  336
Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr
100 105 110

CAT GAT GCT TAT GAA AAT GTG CGT GAA GTT ATC CAG CTA CAA GAT CCT  384
His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro
115 120 125

GGA CTT GAG CAA CTC AAT GCA TCA CCG GCA ATT GCC GGG TTG ATG CAT  432
Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His
130 135 140

CAA GCC TCC CTA TTG GGT ATT AAC GCT TCA ATC TCG CCT GAG CTA TTT  480
Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe
145 150 155 160

AAT ATT CTG ACG GAG GAG ATT ACC GAA GGT AAT GCT GAG GAA CTT TAT  528
Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr
165 170 175

AAG AAA AAT TTT GGT AAT ATC GAA CCG GCC TCA TTG GCT ATG CCG GAA  576
Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu
180 185 190

TAC CTT AAA CGT TAT TAT AAT TTA AGC GAT GAA GAA CTT AGT CAG TTT  624
Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe
195 200 205

ATT GGT AAA GCC AGC AAT TTT GGT CAA CAG GAA TAT AGT AAT AAC CAA  672
Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln
210 215 220

CTT ATT ACT CCG GTA GTC AAC AGC AGT GAT GGC ACG GTT AAG GTA TAT  720
Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr
225 230 235 240

CGG ATC ACC CGC GAA TAT ACA ACC AAT GCT TAT CAA ATG GAT GTG GAG  768
Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu
245 250 255






CTA TTT CCC TTC GGT GGT GAG AAT TAT CGG TTA GAT TAT AAA TTC AAA  816
Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys
260 265 270

AAT TTT TAT AAT GCC TCT TAT TTA TCC ATC AAG TTA AAT GAT AAA AGA  864
Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg
275 280 285

GAA CTT GTT CGA ACT GAA GGC GCT CCT CAA GTC AAT ATA GAA TAC TCC  912
Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser
290 295 300

GCA AAT ATC ACA TTA AAT ACC GCT GAT ATC AGT CAA CCT TTT GAA ATT  960
Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile
305 310 315 320

GGC CTG ACA CGA GTA CTT CCT TCC GGT TCT TGG GCA TAT GCC GCC GCA  1008
Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala
325 330 335

AAA TTT ACC GTT GAA GAG TAT AAC CAA TAC TCT TTT CTG CTA AAA CTT  1056
Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu
340 345 350

AAC AAG GCT ATT CGT CTA TCA CGT GCG ACA GAA TTG TCA CCC ACG ATT  1104
Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile
355 360 365

CTG GAA GGC ATT GTG CGC AGT GTT AAT CTA CAA CTG GAT ATC AAC ACA  1152
Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr
370 375 380

GAC GTA TTA GGT AAA GTT TTT CTG ACT AAA TAT TAT ATG CAG CGT TAT  1200
Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr
385 390 395 400

GCT ATT CAT GCT GAA ACT GCC CTG ATA CTA TGC AAC GCG CCT ATT TCA  1248
Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser
405 410 415

CAA CGT TCA TAT GAT AAT CAA CCT AGC CAA TTT GAT CGC CTG TTT AAT  1296
Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn
420 425 430

ACG CCA TTA CTG AAC GGA CAA TAT TTT TCT ACC GGC GAT GAG GAG ATT  1344
Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile
435 440 445

GAT TTA AAT TCA GGT AGC ACC GGC GAT TGG CGA AAA ACC ATA CTT AAG  1392
Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys
450 455 460

CGT GCA TTT AAT ATT GAT GAT GTC TCG CTC TTC CGC CTG CTT AAA ATT  1440
Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile
465 470 475 480

ACC GAC CAT GAT AAT AAA GAT GGA AAA ATT AAA AAT AAC CTA AAG AAT  1488
Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn
485 490 495

CTT TCC AAT TTA TAT ATT GGA AAA TTA CTG GCA GAT ATT CAT CAA TTA  1536
Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu
500 505 510

ACC ATT GAT GAA CTG GAT TTA TTA CTG ATT GCC GTA GGT GAA GGA AAA  1584
Thr Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys
515 520 525

ACT AAT TTA TCC GCT ATC AGT GAT AAG CAA TTG GCT ACC CTG ATC AGA  1632
Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg
530 535 540

AAA CTC AAT ACT ATT ACC AGC TGG CTA CAT ACA CAG AAG TGG AGT GTA  1680
Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val
545 550 555 560



TTC CAG CTA TTT ATC ATG ACC TCC ACC AGC TAT AAC AAA ACG CTA ACG  1728
Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr
565 570 575

CCT GAA ATT AAG AAT TTG CTG GAT ACC GTC TAC CAC GGT TTA CAA GGT  1776
Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly
580 585 590

TTT GAT AAA GAC AAA GCA GAT TTG CTA CAT GTC ATG GCG CCC TAT ATT  1824
Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile
595 600 605

GCG GCC ACC TTG CAA TTA TCA TCG GAA AAT GTC GCC CAC TCG GTA CTC  1872
Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu
610 615 620

CTT TGG GCA GAT AAG TTA CAG CCC GGC GAC GGC GCA ATG ACA GCA GAA  1920
Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu
625 630 635 640



AAA TTC TGG GAC TGG TTG AAT ACT AAG TAT ACG CCG GGT TCA TCG GAA  1968
Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu
645 650 655

GCC GTA GAA ACG CAG GAA CAT ATC GTT CAG TAT TGT CAG GCT CTG GCA  2016
Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala
660 665 670

CAA TTG GAA ATG GTT TAC CAT TCC ACC GGC ATC AAC GAA AAC GCC TTC  2064
Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe
675 680 685

CGT CTA TTT GTG ACA AAA CCA GAG ATG TTT GGC GCT GCA ACT GGA GCA  2112
Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala
690 695 700

GCG CCC GCG CAT GAT GCC CTT TCA CTG ATT ATG CTG ACA CGT TTT GCG  2160
Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala
705 710 715 720



GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA  2208
Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala
725 730 735

TTT GAA GCT AAC TCG TTA ACG GCA GAA CAA CTG GCT GAT GCC ATG AAT  2256
Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn
740 745 750

CTT GAT GCT AAT TTG CTG TTG CAA GCC AGT ATT CAA GCA CAA AAT CAT  2304
Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His
755 760 765






CAA CAT CTT CCC CCA GTA ACT CCA GAA AAT GCG TTC TCC TGT TGG ACA  2352
Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr
770 775 780

TCT ATC AAT ACT ATC CTG CAA TGG GTT AAT GTC GCA CAA CAA TTG AAT  2400
Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn
785 790 795 800

GTC GCC CCA CAG GGC GTT TCC GCT TTG GTC GGG CTG GAT TAT ATT CAA  2448
Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln
805 810 815

TCA ATG AAA GAG ACA CCG ACC TAT GCC CAG TGG GAA AAC GCG GCA GGC  2496
Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly
820 825 830

GTA TTA ACC GCC GGG TTG AAT TCA CAA CAG GCT AAT ACA TTA CAC GCT  2544
Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala
835 840 845

TTT CTG GAT GAA TCT CGC AGT GCC GCA TTA AGC ACC TAC TAT ATC CGT  2592
Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg
850 855 860



CAA GTC GCC AAG GCA GCG GCG GCT ATT AAA AGC CGT GAT GAC TTG TAT  2640
Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr
865 870 875 880

CAA TAC TTA CTG ATT GAT AAT CAG GTT TCT GCG GCA ATA AAA ACC ACC  2688
Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr
885 890 895

CGG ATC GCC GAA GCC ATT GCC AGT ATT CAA CTG TAC GTC AAC CGG GCA  2736
Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala
900 905 910

TTG GAA AAT GTG GAA GAA AAT GCC AAT TCG GGG GTT ATC AGC CGC CAA  2784
Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln
915 920 925

TTC TTT ATC GAC TGG GAC AAA TAC AAT AAA CGC TAC AGC ACT TGG GCG  2832
Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
930 935 940

GGT GTT TCT CAA TTA GTT TAC TAC CCG GAA AAC TAT ATT GAT CCG ACC  2880
Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr
945 950 955 960



ATG CGT ATC GGA CAA ACC AAA ATG ATG GAC GCA TTA CTG CAA TCC GTC  2928
Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val
965 970 975

AGC CAA AGC CAA TTA AAC GCC GAT ACC GTC GAA GAT GCC TTT ATG TCT  2976
Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser
980 985 990

TAT CTG ACA TCG TTT GAA CAA GTG GCT AAT CTT AAA GTT ATT AGC GCA  3024
Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala
995 1000 1005



TAT CAC GAT AAT ATT AAC GAT CAA GGG CTG ACC TAT T^TC GGA  3072
Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly
1010 1015 1020

CTC AGT GAA ACT GAT GCC GGT GAA TAT TAT TGG CGC AGT GTC GAT CAC  3120
Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
1025 1030 1035 1040

AGT AAA TTC AAC GAC GGT AAA TTC GCG GCT AAT GCC TGG AGT GAA TGG  3168
Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
1045 1050 1055

CAT AAA ATT GAT TGT CCA ATT AAC CCT TAT AAA AGC ACT ATC CGT CCA  3216
His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
1060 1065 1070

GTG ATA TAT AAA TCC CGC CTG TAT CTG CTC TGG TTG GAA CAA AAG GAG  3264
Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
1075 1080 1085

ATC ACC AAA CAG ACA GGA AAT AGT AAA GAT GGC TAT CAA ACT GAA ACG  3312
Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
1090 1095 1100



GAT TAT CGT TAT GAA CTA AAA TTG GCG CAT ATC CGC TAT GAT GGC ACT  3360
Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
1105 1110 1115 1120

TGG AAT ACG CCA ATC ACC TTT GAT GTC AAT AAA AAA ATA TCC GAG CTA  3408
Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
1125 1130 1135

AAA CTG GAA AAA AAT AGA GCG CCC GGA CTC TAT TGT GCC GGT TAT CAA  3456
Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
1140 1145 1150

GGT GAA GAT ACG TTG CTG GTG ATG TTT TAT AAC CAA CAA GAC ACA CTA  3504
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
1155 1160 1165

GAT AGT TAT AAA AAC GCT TCA ATG CAA GGA CTA TAT ATC TTT GCT GAT  3552
Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
1170 1175 1180

ATG GCA TCC AAA GAT ATG ACC CCA GAA CAG AGC AAT GTT TAT CGG GAT  3600
Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
1185 1190 1195 1200

AAT AGC TAT CAA CAA TTT GAT ACC AAT AAT GTC AGA AGA GTG AAT AAC  3648
Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215

CGC TAT GCA GAG GAT TAT GAG ATT CCT TCC TCG GTA AGT AGC CGT AAA  3696
Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
1220 1225 1230

GAC TAT GGT TGG GGA GAT TAT TAC CTC AGC ATG GTA TAT AAC GGA GAT  3744
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245

ATT CCA ACT ATC AAT TAC AAA GCC GCA TCA AGT GAT TTA AAA ATC TAT  3792
Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
1250 1255 1260



ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT GGA TAT GAA GGA CAG AAG  3840
Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280

CGC AAT CAA TGC AAT CTG ATG AAT AAA TAT GGC AAA CTA GGT GAT AAA  3888
Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295

TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT  3936
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1300 1305 1310

AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GGA AAC ACC AGT GGA  3984
Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly
1315 1320 1325

CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT  4032
Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser
1330 1335 1340

AAA GTA GAA GCT TGG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC CAA  4080
Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln
1345 1350 1355 1360

AAT GCC GCC ATT GGT GAT GAT TAT GCT ACA GAC TCT CTG AAT AAA CCG  4128
Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375



GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC AGT AAA GGG ACT CCT  4176
Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala
1380 1385 1390

ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA  4224
Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405

AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC AAG GAG CAA ACT TTT ACC  4272
Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr
1410 1415 1420

GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG  4320
Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440

AAT TAT CAA TTT AAT GCC CTT GAA ATA GAC GGT TCT GGT CTG AAT TTT  4368
Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455



ATT AAC AAC TCA GCC AGT ATT GAT GTT ACT TTT ACC GCA TTT GCG GAG  4416
Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu
1460 1465 1470

GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC AGT ATT CCT GTT ACC CTC  4464
Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu
1475 1480 1485

AAG GTA AGT ACC GAT AAT GCC CTG ACC CTG CAC CAT AAT GAA AAT GGT  4512
Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500

GCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT CTA  4560
Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1505 1510 1515 1520

TTT GCC CGC CAG TTG GTT GCA CGC GCC ACC ACC GGA ATC GAT ACA ATT  4608
Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile
1525 1530 1535



CTG AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAG TTA GGC AAA GG^  4656
Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly
1540 1545 1550

TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT  4704
Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565

GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT AAT  4752
Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580

TCA CAT ATT ATC TAT TCA GGC CAG CTA ACA GAT ACA AAT ATA AAC ATC  4800
Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile
1585 1590 1595 1600

ACA TTA TTT ATT CCT CTT GAT GAT GTC CCA TTG AAT CAA GAT TAT CAC  4848
Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His
1605 1610 1615

GCC AAG GTT TAT ATG ACC TTC AAG AAA TCA CCA TCA GAT GGT ACC TGG  4896
Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp
1620 1625 1630

TGG GGC CCT CAC TTT GTT AGA GAT GAT AAA GGA ATA GTA ACA ATA AAC  4944
Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645

CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT  4992
Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn
1650 1655 1660

ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC  5040
Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe
1665 1670 1675 1680

TGG GAA CTG TTC TAC TAT ACC CCG ATG CTG GTT GCT CAA CGT TTC CTG  5088
Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu
1685 1690 1695

CAT GAA CAG AAC TTC GAT GAA GCC AAC CGT TGG CTG AAA TAT GTC TGG  5136
His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp
1700 1705 1710

AGT CCA TCC GGT TAT ATT GTC CAC GGC CAG ATT CAG AAC TAC CAG TGG  5184
Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp
1715 1720 1725

AAC GTC CGC CCG TTA CTG GAA GAC ACC AGT TGG AAC AGT GAT CCT TTG  5232
Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu
1730 1735 1740

GAT TCC GTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC  5280
Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1745 1750 1755 1760

AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC  5328
Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly
1765 1770 1775

GAC CAT GCT TAT CGC CAA CTG GAA CGA GAT ACA CTC AAC GAA GCG AAG  5376
Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys
1780 1785 1790

ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA  5424
Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805

CCG CTG AGT ACG ACA TGG AGT GAT CCA CGA CTA GAC AGA GCC GCG GAT  5472
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820

ATC ACT ACC CAA AAT GCT CAC GAC AGC GCA ATA GTC GCT CTG CGG CAG  5520
Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840

AAT ATA CCT ACA CCG GCA CCT TTA TCA  5547
Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1849 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 (TcdAii)

Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val
1 5 10 15

Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr
20 25 30

Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr
35 40 45

Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln
50 55 60



Features    From    To      Description
Peptide     1       1849    TcdAii peptide
Fragment    1       12      TcdAii N-terminus (SEQ ID NO: 13)
Fragment    196     211     (SEQ ID NO: 38)
Fragment    466     475     (SEQ ID NO: 17)
Fragment    993     1004    (SEQ ID NO: 23; 12/13)
Fragment    1297    1312    (SEQ ID NO: 18)
Fragment    1390    1409    (SEQ ID NO: 39)
Fragment    1532    1554    (SEQ ID NO: 21; 19/23)



Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu
65 70 75 80

Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys
85 90 95

Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr
100 105 110

His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro
115 120 125

Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His
130 135 140

Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe
145 150 155 160

Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr
165 170 175

Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu
180 185 190

Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe
195 200 205

Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln
210 215 220



Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr
225 230 235 240

Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu
245 250 255

Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys
260 265 270

Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg
275 280 285

Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser
290 295 300

Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile
305 310 315 320






Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala
325 330 335

Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu
340 345 350

Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile
355 360 365

Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr
370 375 380

Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr
385 390 395 400

Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser
405 410 415

Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn
420 425 430

Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile
435 440 445

Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys
450 455 460

Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile
465 470 475 480



Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn
485 490 495

Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu
500 505 510

Thr Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys
515 520 525

Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg
530 535 540

Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val
545 550 555 560

Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr
565 570 575

Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly
580 585 590

Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile
595 600 605

Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu
610 615 620

Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu
625 630 635 640



Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu
645 650 655

Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala
660 665 670

Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe
675 680 685

Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala
690 695 700

Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala
705 710 715 720

Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala
725 730 735

Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn
740 745 750

Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His
755 760 765

Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr
770 775 780

Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn
785 790 795 800

Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln
805 810 815



Ser Met Lys Glu Thr Pro Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly 
35 620 825 830 

Val Leu Thr Ala Gly Leu Asn Ser Gin Gin Ala Asn Thr Leu His Ala 
835 840 845 

4 0 Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr lie Arg 
850 855 860 



Gin Val Ala Lys Ala Ala Ala Ala lie Lys Ser Arg Asp Aso Leu Tyr 

865 870 875 880 

Gin Tvr Leu Leu lie Asp Asn Gin Val Ser Ala Ala lie Lvs Thr Thr 

885 890 895 



Arg lie Ala Glu Ala He Ala Ser He Gin Leu Tyr Val Asn Arg Ala 

50 900 905 910 

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val lie Ser Arg Gin 

915 920 ' 925 

55 Phe Phe He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 

930 935 940 



Gly Val Ser -Gin Leu Val Tyr Tyr Pro Glu Asn Tyr lie Asp Pro Thr 

945 950 955 960 

Met Arg lie Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val 

965 970 975 



Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 
65 980 985 5 990 

Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He Ser Ala 
995 1000 1005 

7 0 Tyr His Asp Asn lie Asn Asn Asp Gin Gly Leu Thr Tyr Phe He Glv 
1C10 1015 1020 
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p Ala 



Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
1025 1030 1035 1040

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
1045 1050 1055

His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
1060 1065 1070

Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
1075 1080 1085

Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
1090 1095 1100

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
1105 1110 1115 1120

Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
1125 1130 1135

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
1140 1145 1150

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
1155 1160 1165

Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
1170 1175 1180

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
1185 1190 1195 1200

Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
1220 1225 1230

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245

Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
1250 1255 1260

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280

Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1300 1305 1310

Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly
1315 1320 1325

Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser
1330 1335 1340

Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln
1345 1350 1355 1360

Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375

Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala
1380 1385 1390



Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405



Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr
1410 1415 1420

Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440

Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455

Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu
1460 1465 1470

Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu
1475 1480 1485

Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500

Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1505 1510 1515 1520

Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile
1525 1530 1535

Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly
1540 1545 1550

Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565

Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580

Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile
1585 1590 1595 1600

Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His
1605 1610 1615

Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp
1620 1625 1630

Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645

Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn
1650 1655 1660

Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe
1665 1670 1675 1680

Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu
1685 1690 1695

His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp
1700 1705 1710

Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp
1715 1720 1725

Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu
1730 1735 1740

Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1745 1750 1755 1760



Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly
1765 1770 1775

Asp His Ala Tyr Arg Leu Glu Arg Asp Thr Leu Asn G^^ftla Lys
1780 1785 1790

Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805

Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820

Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840

Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849



(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 (tcdAiii coding region):



TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1 5 10 15

GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC AAT 96
Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn
20 25 30

CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144
Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro Ile
35 40 45

TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT GCC 192
Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala
50 55 60

ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG TGG 240
Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp
65 70 75 80

CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG CTC 288
Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln Leu
85 90 95

ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC GCG 336
Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala
100 105 110

GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA TTC 384
Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu
115 120 125

ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC GAG 432
Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala Glu
130 135 140

AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT GAT 480
Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp
145 150 155 160

AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC CAA 528
Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln
165 170 175

GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT CAG 576
Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln
180 185 190

GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC TTC 624
Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe
195 200 205






GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG ACA 672
Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr
210 215 220

GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG GAT 720
Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp
225 230 235 240

AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG GAG 768
Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
245 250 255

ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT CAG 816
Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln
260 265 270

CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA ACC 864
Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr
275 280 285

AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC CTG 912
Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu
290 295 300

CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT CGA 960
Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg
305 310 315 320

CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT TGC 1008
Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys
325 330 335

CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT GCC 1056
Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala
340 345 350

CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG CTT 1104
Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT CAT 1152
Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His
370 375 380

CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG CTG 1200
Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC CTG 1248
Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu
405 410 415

GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC GGC 1296
Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly
420 425 430

AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA ACC 1344
Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr
435 440 445



TCT TTG CAG GCA TCA G^^TCA TTC GCT GAT TTG AAA ATT CG*^\ GAT 1392
Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp
450 455 460

TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC GTC 1440
Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val
465 470 475 480

ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA TTG 1488
Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
485 490 495

TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG GCA 1536
Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala
500 505 510

GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC AAC 1584
Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn
515 520 525

GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC ACG 1632
Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr
530 535 540

CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA CAA 1680
Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln
545 550 555 560

GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC TAC 1728
Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr
565 570 575

ACC ATT AAA TAA 1740
Thr Ile Lys •••
579

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51 (TcdAiii):

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1 5 10 15

Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn
20 25 30

Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro Ile
35 40 45

Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala
50 55 60

Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp
65 70 75 80

Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln Leu
85 90 95

Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala
100 105 110



Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu
115 120 125

Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala Glu
130 135 140

Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp
145 150 155 160

Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln
165 170 175

Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln
180 185 190

Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe
195 200 205

Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr
210 215 220

Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp
225 230 235 240

Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
245 250 255

Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln
260 265 270

Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr
275 280 285

Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu
290 295 300

Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg
305 310 315 320

Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys
325 330 335

Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala
340 345 350

Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His
370 375 380

Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu
405 410 415

Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly
420 425 430

Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr
435 440 445

Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp
450 455 460

Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val
465 470 475 480

Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
485 490 495

Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala
500 505 510

Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn
515 520 525

Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr
530 535 540

Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln
545 550 555 560

Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr
565 570 575

Thr Ile Lys •••
579



(2) INFORMATION FOR SEQ ID NO: 52: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 (tcbAa coding region):

TTT ATA CAA GGT TAT AGT GAT CTG TTT GGT AAT CGT GCT GAT AAC TAT 48
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr
1 5 10 15

GCC GCG CCG GGC TCG GTT GCA TCG ATG TTC TCA CCG GCG GCT TAT TTG 96
Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu
20 25 30

ACG GAA TTG TAC CGT GAA GCC AAA AAC TTG CAT GAC AGC AGC TCA ATT 144
Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile
35 40 45

TAT TAC CTA GAT AAA CGT CGC CCG GAT TTA GCA AGC TTA ATG CTC AGC 192
Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser
50 55 60

CAG AAA AAT ATG GAT GAG GAA ATT TCA ACG CTG GCT CTC TCT AAT GAA 240
Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu
65 70 75 80

TTG TGC CTT GCC GGG ATC GAA ACA AAA ACA GGA AAA TCA CAA GAT GAA 288
Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu
85 90 95

GTG ATG GAT ATG TTG TCA ACT TAT CGT TTA AGT GGA GAG ACA CCT TAT 336
Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr
100 105 110

CAT CAC GCT TAT GAA ACT GTT CGT GAA ATC GTT CAT GAA CGT GAT CCA 384
His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro
115 120 125

GGA TTT CGT CAT TTG TCA CAG GCA CCC ATT GTT GCT GCT AAG CTC GAT 432
Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp
130 135 140

CCT GTG ACT TTG TTG GGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480
Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tyr
145 150 155 160

AAC TTG CTG ATT GAG GAG ATC CCG GAA AAA GAT GAA GCC GCG CTT GAT 528
Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp
165 170 175

ACG CTT TAT AAA ACA AAC TTT GGC GAT ATT ACT ACT GCT CAG TTA ATG 576
Thr Leu Tyr Lys Thr Asn Phe Gly Asp Ile Thr Thr Ala Gln Leu Met
180 185 190

TCC CCA AGT TAT CTG GCC CGG TAT TAT GGC GTC TCA CCG GAA GAT ATT 624
Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp Ile
195 200 205

GCC TAC GTG ACG ACT TCA TTA TCA CAT GTT GGA TAT AGC AGT GAT ATT 672
Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp Ile
210 215 220

CTG GTT ATT CCG TTG GTC GAT GGT GTG GGT AAG ATG GAA GTA GTT CGT 720
Leu Val Ile Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg
225 230 235 240

GTT ACC CGA ACA CCA TCG GAT AAT TAT ACC AGT CAG ACG AAT TAT ATT 768
Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gln Thr Asn Tyr Ile
245 250 255

GAG CTG TAT CCA CAG GGT GGC GAC AAT TAT TTG ATC AAA TAC AAT CTA 816
Glu Leu Tyr Pro Gln Gly Gly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu
260 265 270

AGC AAT AGT TTT GGT TTG GAT GAT TTT TAT CTG CAA TAT AAA GAT GGT 864
Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gln Tyr Lys Asp Gly
275 280 285



TCC GCT GAT TGG ACT GAG ATT GCC CAT AAT CCC TAT CCT GAT ATG GTC 912
Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val
290 295 300

ATA AAT CAA AAG TAT GAA TCA CAG GCG ACA ATC AAA CGT AGT GAC TCT 960
Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser
305 310 315 320

GAC AAT ATA CTC AGT ATA GGG TTA CAA AGA TGG CAT AGC GGT AGT TAT 1008
Asp Asn Ile Leu Ser Ile Gly Leu Gln Arg Trp His Ser Gly Ser Tyr
325 330 335

AAT TTT GCC GCC GCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056
Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala
340 345 350

TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104
Phe Leu Leu Lys Met Asn Lys Ala Ile Arg Leu Leu Lys Ala Thr Gly
355 360 365

CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT AGT GTT AAT AGC ACC 1152
Leu Ser Phe Ala Thr Leu Glu Arg Ile Val Asp Ser Val Asn Ser Thr
370 375 380

AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200
Lys Ser Ile Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe
385 390 395 400

TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248
Tyr Ile Asp Arg Tyr Gly Ile Ser Glu Glu Thr Ala Ala Ile Leu Ala
405 410 415

AAT ATT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296
Asn Ile Asn Ile Ser Gln Gln Ala Val Gly Asn Gln Leu Ser Gln Phe
420 425 430






GAG CAA CTA TTT AAT CCG CCG CTC AAT GGT ATT CGC T AA ATC 1344
Glu Gln Leu Phe Asn His Pro Pro Leu Asn Gly Ile Arg Tyr Glu Ile
435 440 445

AGT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT GAT CTG AAC CTT AAA 1392
Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys
450 455 460

CCA GAC AGT ACC GGT GAT GAT CAA CGC AAG GCG GTT TTA AAA CGC GCG 1440
Pro Asp Ser Thr Gly Asp Asp Gln Arg Lys Ala Val Leu Lys Arg Ala
465 470 475 480

TTT CAG GTT AAC GCC AGT GAG TTG TAT CAG ATG TTA TTG ATC ACT GAT 1488
Phe Gln Val Asn Ala Ser Glu Leu Tyr Gln Met Leu Leu Ile Thr Asp
485 490 495

CGT AAA GAA GAC GGT GTT ATC AAA AAT AAC TTA GAG AAT TTG TCT GAT 1536
Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp
500 505 510






CTG TAT TTG GTT AGT TTG CTG GCC CAG ATT CAT AAC CTG ACT ATT GCT 1584
Leu Tyr Leu Val Ser Leu Leu Ala Gln Ile His Asn Leu Thr Ile Ala
515 520 525

GAA TTG AAC ATT TTG TTG GTG ATT TGT GGC TAT GGC GAC ACC AAC ATT 1632
Glu Leu Asn Ile Leu Leu Val Ile Cys Gly Tyr Gly Asp Thr Asn Ile
530 535 540

TAT CAG ATT ACC GAC GAT AAT TTA GCC AAA ATA GTG GAA ACA TTG TTG 1680
Tyr Gln Ile Thr Asp Asp Asn Leu Ala Lys Ile Val Glu Thr Leu Leu
545 550 555 560

TGG ATC ACT CAA TGG TTG AAG ACC CAA AAA TGG ACA GTT ACC GAC CTG 1728
Trp Ile Thr Gln Trp Leu Lys Thr Gln Lys Trp Thr Val Thr Asp Leu
565 570 575

TTT CTG ATG ACC ACG GCC ACT TAC AGC ACC ACT TTA ACG CCA GAA ATT 1776
Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu Ile
580 585 590

AGC AAT CTG ACG GCT ACG TTG TCT TCA ACT TTG CAT GGC AAA GAG AGT 1824
Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser
595 600 605



CTG ATT GGG GAA GAT CTG AAA AGA GCA ATG GCG CCT TGC TTC ACT TCG 1872
Leu Ile Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser
610 615 620

GCT TTG CAT TTG ACT TCT CAA GAA GTT GCG TAT GAC CTG CTG TTG TGG 1920
Ala Leu His Leu Thr Ser Gln Glu Val Ala Tyr Asp Leu Leu Leu Trp
625 630 635 640

ATA GAC CAG ATT CAA CCG GCA CAA ATA ACT GTT GAT GGG TTT TGG GAA 1968
Ile Asp Gln Ile Gln Pro Ala Gln Ile Thr Val Asp Gly Phe Trp Glu
645 650 655

GAA GTG CAA ACA ACA CCA ACC AGC TTG AAG GTG ATT ACC TTT GCT CAG 2016
Glu Val Gln Thr Thr Pro Thr Ser Leu Lys Val Ile Thr Phe Ala Gln
660 665 670

GTG CTG GCA CAA TTG AGC CTG ATC TAT CGT CGT ATT GGG TTA AGT GAA 2064
Val Leu Ala Gln Leu Ser Leu Ile Tyr Arg Arg Ile Gly Leu Ser Glu
675 680 685

ACG GAA CTG TCA CTG ATC GTG ACT CAA TCT TCT CTG CTA GTG GCA GGC 2112
Thr Glu Leu Ser Leu Ile Val Thr Gln Ser Ser Leu Leu Val Ala Gly
690 695 700

AAA AGC ATA CTG GAT CAC GGT CTG TTA ACC CTG ATG GCC TTG GAA GGT 2160
Lys Ser Ile Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly
705 710 715 720

TTT CAT ACC TGG GTT AAT GGC TTG GGG CAA CAT GCC TCC TTG ATA TTG 2208
Phe His Thr Trp Val Asn Gly Leu Gly Gln His Ala Ser Leu Ile Leu
725 730 735

GCG GCG TTG AAA GAC GGA GCC TTG ACA GTT ACC GAT GTA GCA CAA GCT 2256
Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gln Ala
740 745 750

ATG AAT AAG GAG GAA TCT CTC CTA CAA ATG GCA GCT AAT CAG GTG GAG 2304
Met Asn Lys Glu Glu Ser Leu Leu Gln Met Ala Ala Asn Gln Val Glu
755 760 765

AAG GAT CTA ACA AAA CTG ACC AGT TGG ACA CAG ATT GAC GCT ATT CTG 2352
Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gln Ile Asp Ala Ile Leu
770 775 780

CAA TGG TTA CAG ATG TCT TCG GCC TTG GCG GTT TCT CCA CTG GAT CTG 2400
Gln Trp Leu Gln Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu
785 790 795 800

GCA GGG ATG ATG GCC CTG AAA TAT GGG ATA GAT CAT AAC TAT GCT GCC 2448
Ala Gly Met Met Ala Leu Lys Tyr Gly Ile Asp His Asn Tyr Ala Ala
805 810 815

TGG CAA GCT GCG GCG GCT GCG CTG ATG GCT GAT CAT GCT AAT CAG GCA 2496
Trp Gln Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gln Ala
820 825 830

CAG AAA AAA CTG GAT GAG ACG TTC AGT AAG GCA TTA TGT AAC TAT TAT 2544
Gln Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr
835 840 845

ATT AAT GCT GTT CTC GAT AGT GCT GCT GGA GTA CGT GAT CGT AAC GGT 2592
Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly
850 855 860

TTA TAT ACC TAT TTG CTG ATT GAT AAT CAG GTT TCT GCC GAT GTG ATC 2640
Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile
865 870 875 880

ACT TCA CGT ATT GCA GAA GCT ATC GCC GGT ATT CAA CTG TAC GTT AAC 2688
Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn
885 890 895

CGG GCT TTA AAC CGA GAT GAA GGT CAG CTT GCA TCG GAC GTT AGT ACC 2736
Arg Ala Leu Asn Arg Asp Glu Gly Gln Leu Ala Ser Asp Val Ser Thr
900 905 910

CGT CAG TTC TTC ACT GAC TGG GAA CGT TAC AAT AAA CGT TAC AGT ACT 2784
Arg Gln Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr
915 920 925

TGG GCT GGT GTC TCT GAA CTG GTC TAT TAT CCA GAA AAC TAT GTT GAT 2832
Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp
930 935 940

CCC ACT CAG CGC ATT GGG CAA ACC AAA ATG ATG GAT GCG CTG TTG CAA 2880
Pro Thr Gln Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln
945 950 955 960

TCC ATC AAC CAG AGC CAG CTA AAT GCG GAT ACG GTG GAA GAT GCT TTC 2928
Ser Ile Asn Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe
965 970 975

AAA ACT TAT TTG ACC AGC TTT GAG CAG GTA GCA AAT CTG AAA GTA ATT 2976
Lys Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile
980 985 990

AGT GCT TAC CAC GAT AAT GTG AAT GTG GAT CAA GGA TTA ACT TAT TT^ 3024
Ser Ala Tyr His Asp Asn Val Asn Val Asp Gln Gly Leu Thr Tyr Phe
995 1000 1005

ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072
Ile Gly Ile Asp Gln Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val
1010 1015 1020

GAT CAC AGC AAA TGT GAA AAT GGC AAG TTT GCC GCT AAT GCT TGG GGT 3120
Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly
1025 1030 1035 1040

GAG TGG AAT AAA ATT ACC TGT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168
Glu Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile
1045 1050 1055

CGT CCG GTT GTT TAT ATG TCC CGC TTA TAT CTG CTA TGG CTG GAG CAG 3216
Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln
1060 1065 1070

CAA TCA AAG AAA AGT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264
Gln Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gln Tyr Asn
1075 1080 1085

TTA AAA CTG GCT CAT ATT CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3312
Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe
1090 1095 1100

ACT TTT GAT GTG ACA GAA AAG GTA AAA AAT TAC ACG TCG AGT ACT GAT 3360
Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp
1105 1110 1115 1120

GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3408
Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gln Gly Glu Asp
1125 1130 1135



ACT CTA TTA GTT ATG TTC TAT TCG ATG CAG AGT AGT TAT AGC TCC TAT 3456
Thr Leu Leu Val Met Phe Tyr Ser Met Gln Ser Ser Tyr Ser Ser Tyr
1140 1145 1150

ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504
Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met
1155 1160 1165

TCA TCA GAC AAT ATG ACG AAT GCA CAA GCA ACT AAC TAT TGG AAT AAC 3552
Ser Ser Asp Asn Met Thr Asn Ala Gln Ala Thr Asn Tyr Trp Asn Asn
1170 1175 1180

AGT TAT CCG CAA TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC AAT 3600
Ser Tyr Pro Gln Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn
1185 1190 1195 1200

AAA AAA GTC ATA ACC AGA AGA GTT AAT AAC CGT TAT GCG GAG GAT TAT 3648
Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr
1205 1210 1215

GAA ATT CCT TCC TCT GTG ACA AGT AAC AGT AAT TAT TCT TGG GGT GAT 3696
Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp
1220 1225 1230

CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 3744
His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe
1235 1240 1245

GAA TCG GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT 3792
Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser
1250 1255 1260

ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TGT AAT CTT 3840
Ile Ile His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gln Cys Asn Leu
1265 1270 1275 1280



ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3888
Met Lys Gln Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser
1285 1290 1295

TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936
Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe
1300 1305 1310

GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984
Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn Glu Asn
1315 1320 1325






CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4032
Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn
1330 1335 1340

AAA ACA GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4080
Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gln Cys Ile Asp Ala Gly Thr
1345 1350 1355 1360

AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 4128
Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gln Glu Ile Glu Val Ile Ser
1365 1370 1375

GTT ACT GGT GGG TAT TGG TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 4176
Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn
1380 1385 1390

ATC AAT ACG GGC ATT GAT AGT GCT AAA GTA AAA GTC ACC GTA AAA GCG 4224
Ile Asn Thr Gly Ile Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala
1395 1400 1405

GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4272
Gly Gly Asp Asp Gln Ile Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro
1410 1415 1420

CAG CAA CCG GCA CCC AGT TTT GAG GAG ATG ATT TAT CAG TTC AAT AAC 4320
Gln Gln Pro Ala Pro Ser Phe Glu Glu Met Ile Tyr Gln Phe Asn Asn
1425 1430 1435 1440

CTG ACA ATA GAT TGT AAG AAT TTA AAT TTC ATC GAC AAT CAG GCA CAT 4368
Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gln Ala His
1445 1450 1455

ATT GAG ATT GAT TTC ACC GCT ACG GCA CAA GAT GGC CGA TTC TTG GGT 4416
Ile Glu Ile Asp Phe Thr Ala Thr Ala Gln Asp Gly Arg Phe Leu Gly
1460 1465 1470

GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464
Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu
1475 1480 1485

AAC GTG ATT GCG TTA TAT AGC GAA AAT AAC GGT GTT CAA TAT ATG CAA 4512
Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln
1490 1495 1500

ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4560
Ile Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gln Gln Leu
1505 1510 1515 1520

GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC AGT ATG GAA ACT 4608
Val Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr
1525 1530 1535

CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC ACA TAT GTG CAG CTT 4656
Gln Asn Ile Gln Glu Pro Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
1540 1545 1550

GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704
Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys Ser Phe
1555 1560 1565



GCT ATT GAA TAT GTT^^f- ATA TTT AAA GAG AAC GAT AGT T^^GTG ATT 4752
Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser Phe Val Ile
1570 1575 1580

TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4800
Tyr Gln Gly Glu Leu Ser Glu Thr Ser Gln Thr Val Val Lys Val Phe
1585 1590 1595 1600

TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG GTA 4848
Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val
1605 1610 1615

CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT AAG ATC TTG TTC GAC CGT 4896
Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg
1620 1625 1630

ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944
Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys
1635 1640 1645

ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992
Thr Phe Ser Gly Leu Ser Ser Ala Gln Ala Leu Lys Asn Asp Ser Glu
1650 1655 1660



2 5 CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 504 0 

Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 

1665 1670 1675 1680 

TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTG CAG GAA CAG AAT 5088 

30 Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Glu Gin Asn 

1685 1690 1695 

TTT GAT GCG GCG AAC CAT TGG TTC CGT TAT GTC TGG AGT CCA TCC GGT 5136 

Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 
35 1700 1705 1710 

TAT ATC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5184 

Tyr lie Val Asd Gly Lys lie Ala lie Tyr His Trp Asn Val Arg Pro 

1715 1720 1725 



CTG GAA GAA GAC ACC AGT TGG AAT GCA CAA CAA CTG GAC TCC ACC GAT 5232 
Leu Glu Glu Asd Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 
1730 1735 1740 



4 5 CCA GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG GTG GCT ACC 5280 

Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 

1745 1750 1755 1760 

TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 5328 

50 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 

1765 1770 1775 

CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATG TGG TAT ACA 5 37 6 

Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 
55 1780 1785 1790 

CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 54 24 

Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 

1795 1800 1805 



ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 54 72 
Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 
1810 1815 1820 



65 CAG GTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5 520 
Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 
1825 1830 1835 1840 

AAA ACC CCG TTG 5532 
70 Lys Thr Pro Leu 
1844 
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(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1844 amino acids 

(B) TYPE: amino acids 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53 (TcbAii):

Features    From    To      Description
Peptide     1       1844    TcbAii peptide
Fragment    1       11      (SEQ ID NO:1)
Fragment    978     990     (SEQ ID NO:23)
Fragment    1387    1401    (SEQ ID NO:22)
Fragment    1484    1505    (SEQ ID NO:24)
Fragment    1527    1552    (SEQ ID NO:21)

Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn
5 10 15

Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr
20 25 30

Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile
35 40 45

Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser
50 55 60



Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu
65 70 75 80

Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu
85 90 95

Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr
100 105 110

His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro
115 120 125

Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp
130 135 140

Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tyr
145 150 155 160

Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp
165 170 175

Thr Leu Tyr Lys Thr Asn Phe Gly Asp Ile Thr Thr Ala Gln Leu Met
180 185 190

Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp Ile
195 200 205

Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp Ile
210 215 220

Leu Val Ile Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg
225 230 235 240

Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gln Thr Asn Tyr Ile
245 250 255

Glu Leu Tyr Pro Gin Gly Asp Asn Tyr . Leu He Lys T^BKsn Leu 

260 265 270 

Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gln Tyr Lys Asp Gly
275 280 285

Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val
290 295 300

Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser
305 310 315 320

Asp Asn Ile Leu Ser Ile Gly Leu Gln Arg Trp His Ser Gly Ser Tyr
325 330 335

Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala
340 345 350

Phe Leu Leu Lys Met Asn Lys Ala Ile Arg Leu Leu Lys Ala Thr Gly
355 360 365

Leu Ser Phe Ala Thr Leu Glu Arg Ile Val Asp Ser Val Asn Ser Thr
370 375 380

Lys Ser Ile Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe
385 390 395 400

Tyr Ile Asp Arg Tyr Gly Ile Ser Glu Glu Thr Ala Ala Ile Leu Ala
405 410 415

Asn Ile Asn Ile Ser Gln Gln Ala Val Gly Asn Gln Leu Ser Gln Phe
420 425 430

Glu Gln Leu Phe Asn His Pro Pro Leu Asn Gly Ile Arg Tyr Glu Ile
435 440 445

Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys
450 455 460

Pro Asp Ser Thr Gly Asp Asp Gln Arg Lys Ala Val Leu Lys Arg Ala
465 470 475 480

Phe Gln Val Asn Ala Ser Glu Leu Tyr Gln Met Leu Leu Ile Thr Asp
485 490 495

Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp
500 505 510

Leu Tyr Leu Val Ser Leu Leu Ala Gln Ile His Asn Leu Thr Ile Ala
515 520 525

Glu Leu Asn Ile Leu Leu Val Ile Cys Gly Tyr Gly Asp Thr Asn Ile
530 535 540

Tyr Gln Ile Thr Asp Asp Asn Leu Ala Lys Ile Val Glu Thr Leu Leu
545 550 555 560

Trp Ile Thr Gln Trp Leu Lys Thr Gln Lys Trp Thr Val Thr Asp Leu
565 570 575

Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu Ile
580 585 590

Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser
595 600 605

Leu Ile Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser
610 615 620

Ala Leu His Leu Thr Ser Gln Glu Val Ala Tyr Asp Leu Leu Leu Trp
625 630 635 640

Ile Asp Gln Ile Gln Pro Ala Gln Ile Thr Val Asp Gly Phe Trp Glu
645 650 655

Glu Val Gln Thr Thr Pro Thr Ser Leu Lys Val Ile Thr Phe Ala Gln
660 665 670

Val Leu Ala Gln Leu Ser Leu Ile Tyr Arg Arg Ile Gly Leu Ser Glu
675 680 685

Thr Glu Leu Ser Leu Ile Val Thr Gln Ser Ser Leu Leu Val Ala Gly
690 695 700

Lys Ser Ile Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly
705 710 715 720

Phe His Thr Trp Val Asn Gly Leu Gly Gln His Ala Ser Leu Ile Leu
725 730 735

Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gln Ala
740 745 750

Met Asn Lys Glu Glu Ser Leu Leu Gln Met Ala Ala Asn Gln Val Glu
755 760 765

Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gln Ile Asp Ala Ile Leu
770 775 780



Gln Trp Leu Gln Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu

30 785 790 795 800 

Ala Gly Met Met Ala Leu Lys Tyr Gly Ile Asp His Asn Tyr Ala Ala
805 810 815

Trp Gln Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gln Ala
820 825 830

Gln Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr
835 840 845

Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly
850 855 860

Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile
865 870 875 880

Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn
885 890 895

Arg Ala Leu Asn Arg Asp Glu Gly Gln Leu Ala Ser Asp Val Ser Thr
900 905 910

Arg Gln Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr
915 920 925

Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp
930 935 940

Pro Thr Gln Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln
945 950 955 960

Ser Ile Asn Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe
965 970 975

Lys Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile
980 985 990

Ser Ala Tyr His Asp Asn Val Asn Val Asp Gln Gly Leu Thr Tyr Phe
995 1000 1005

Ile Gly Ile Asp Gln Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val
1010 1015 1020

Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly
1025 1030 1035 1040

Glu Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile
1045 1050 1055

Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln
1060 1065 1070

Gln Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gln Tyr Asn
1075 1080 1085

Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe
1090 1095 1100

Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp
1105 1110 1115 1120

Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gln Gly Glu Asp
1125 1130 1135

Thr Leu Leu Val Met Phe Tyr Ser Met Gln Ser Ser Tyr Ser Ser Tyr
1140 1145 1150

Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met
1155 1160 1165

Ser Ser Asp Asn Met Thr Asn Ala Gln Ala Thr Asn Tyr Trp Asn Asn
1170 1175 1180

Ser Tyr Pro Gln Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn
1185 1190 1195 1200

Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr
1205 1210 1215

Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp
1220 1225 1230

His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe
1235 1240 1245

Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser
1250 1255 1260

Ile Ile His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gln Cys Asn Leu
1265 1270 1275 1280

Met Lys Gln Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser
1285 1290 1295

Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe
1300 1305 1310

Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn Glu Asn
1315 1320 1325

Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn
1330 1335 1340

Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gln Cys Ile Asp Ala Gly Thr
1345 1350 1355 1360

Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gln Glu Ile Glu Val Ile Ser
1365 1370 1375

Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn
1380 1385 1390

Ile Asn Thr G:^fc.e Asp Ser Ala Lys Val Lys Val^^ Val Lys Ala
1395 1400 1405

Gly Gly Asp Asp Gln Ile Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro
1410 1415 1420

Gln Gln Pro Ala Pro Ser Phe Glu Glu Met Ile Tyr Gln Phe Asn Asn
1425 1430 1435 1440

Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gln Ala His
1445 1450 1455

Ile Glu Ile Asp Phe Thr Ala Thr Ala Gln Asp Gly Arg Phe Leu Gly
1460 1465 1470

Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu
1475 1480 1485

Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln
1490 1495 1500

Ile Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gln Gln Leu
1505 1510 1515 1520

Val Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr
1525 1530 1535

Gln Asn Ile Gln Glu Pro Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
1540 1545 1550

Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys Ser Phe
1555 1560 1565

Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser Phe Val Ile
1570 1575 1580

Tyr Gln Gly Glu Leu Ser Glu Thr Ser Gln Thr Val Val Lys Val Phe
1585 1590 1595 1600

Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val
1605 1610 1615

Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg
1620 1625 1630

Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys
1635 1640 1645

Thr Phe Ser Gly Leu Ser Ser Ala Gln Ala Leu Lys Asn Asp Ser Glu
1650 1655 1660

Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe
1665 1670 1675 1680

Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gln Glu Gln Asn
1685 1690 1695

Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly
1700 1705 1710

Tyr Ile Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro
1715 1720 1725

Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp
1730 1735 1740

Pro Asp Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr
1745 1750 1755 1760

Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr
1765 1770 1775

Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr
1780 1785 1790

Gln Ala Leu Asn Leu Leu Gly Asp Glu Pro Gln Val Met Leu Ser Thr
1795 1800 1805

Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gln
1810 1815 1820

Gln Val Arg Gln Gln Val Leu Thr Gln Leu Arg Leu Asn Ser Arg Val
1825 1830 1835 1840

Lys Thr Pro Leu
1844



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1722 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 (tcbAiii coding region):

CTA GGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT 48
Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn
1 5 10 15

AGC AAG CTC AAA GGC TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT 96
Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn
20 25 30

TTA CGT CAT AAT CTG TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG 144
Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu
35 40 45

TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA 192
Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser
50 55 60

GCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240
Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His
65 70 75 80

CGC TTC CCT CAA ATG CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT 288
Arg Phe Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu
85 90 95

ATA CAG TTC GGT AGT TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG 336
Ile Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala
100 105 110

GAA GCT ATG AGT CAA CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG 384
Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu
115 120 125

ACC AGT ATT CGT ATG CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA 432
Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu
130 135 140

AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC 480
Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp
145 150 155 160

AGC TAT AGC C^^G TAT GAG GAG AAC ATC AAC GCA^^ GAG CAG CGA 528
Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg
165 170 175

GCG CTG GCG TTA CGC TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG 576
Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln
180 185 190

ATT TCC CGT ATG GCA GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC 624
Ile Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe
195 200 205

GGC CTG GCT GAT GGC GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC 672
Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile
210 215 220

GCT GAC GGT ATT GAG TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG 720
Ala Asp Gly Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu
225 230 235 240

AAA GTT GCT CAG TCG GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA 768
Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys
245 250 255

ATT CAG CGT GAC AAC GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA 816
Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln
260 265 270

CTG GAA TCA CTG TCT ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG 864
Leu Glu Ser Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu
275 280 285

TAC CTG AAA ACC CAG CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA 912
Tyr Leu Lys Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu
290 295 300

AGA AGC AAA TTC AGT AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT 960
Arg Ser Lys Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg
305 310 315 320

TTG TCA GGT ATT TAT TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC 1008
Leu Ser Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys
325 330 335

CTG ATG GCA GAG CAA TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT 1056
Leu Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile
340 345 350

AGC TTT GTC AAA CCG GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG 1104
Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

TGT GGA GAA GCT TTG ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT 1152
Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr
370 375 380

CTG AAA TGG GAA TCT CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG 1200
Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

GCA GTG GTT TAT GAT TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG 1248
Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala
405 410 415

GAA CAA ATA CCT GCA TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT 1296
Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr
420 425 430

AAA GAA AAT GGG TTA TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC 1344
Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val
435 440 445

AAA TTG TCC GAC TTG AAA CTG GGA ACG GAT TAT CCA GAC ACT ATC GTT 1392
Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val
450 455 460

GGT AGC AAC AAG GTT CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT 1440
Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro
465 470 475 480

GCA TTG GTT GGG CCT TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT 1488
Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly
485 490 495

GGC AGT ACT CAA TTG CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT 1536
Gly Ser Thr Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His
500 505 510

GGT ACC AAT GAT AGT GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA 1584
Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys
515 520 525

TAC CTG CCA TTT GAA GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT 1632
Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn
530 535 540

CTT CAA TTT CCG AAT GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT 1680
Leu Gln Phe Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr
545 550 555 560

ATG AGC GAT ATT ATT TTG CAT ATT CGT TAT ACC ATC CGT TAA 1722
Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg •••
565 570 573

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 amino acids

(B) TYPE: amino acids

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55 (TcbAiii):

Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn
1 5 10 15

Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn
20 25 30

Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu
35 40 45

Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser
50 55 60

Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His
65 70 75 80

Arg Phe Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu
85 90 95

Ile Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala
100 105 110

Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu
115 120 125

Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu
130 135 140

Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp
145 150 155 160

Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg
165 170 175

Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln
180 185 190

Ile Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe
195 200 205

Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile
210 215 220

Ala Asp Gly Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu
225 230 235 240

Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys
245 250 255

Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln
260 265 270

Leu Glu Ser Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu
275 280 285

Tyr Leu Lys Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu
290 295 300

Arg Ser Lys Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg
305 310 315 320

Leu Ser Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys
325 330 335

Leu Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile
340 345 350

Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr
370 375 380

Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala
405 410 415

Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr
420 425 430

Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val
435 440 445

Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val
450 455 460

Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro
465 470 475 480

Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly
485 490 495

Gly Ser Thr Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His
500 505 510

Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys
515 520 525

Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn
530 535 540

Leu Gln Phe Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr
545 550 555 560

Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg •••
565 570 573



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 (tccA)

1 ATG AAT CAA CTC GCC AGT CCC CTG ATT TCC CGC ACC GAA GAG ATC CAC 48
1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16

49 AAC TTA CCC GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT 96
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32

97 GTG GTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144
33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48

145 CTC GGG CGC AGT GCT GAA AAA ATG TAT GAC CTG GCA GTG GGC TAT GCT 192
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64

193 CAT CAG GTG TTA CAC CAT TTT CGC CGT AAT TCT CTT AGT GAA GCT GTT 240
65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80

241 CAG TTT GGC TTG AGA AGT CCG TTC TCC GTA TCA GGC CCG GAT TAC GCC 288
81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96

289 AAT CAG TTT CTT GAT GCA AAC ACG GGT TGG AAA GAT AAA GCA CCA AGT 336
97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112

337 GGA TCA CCG GAA GCC AAT GAT GCG CCG GTA GCC TAT CTG ACT CAT ATT 384
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128

385 TAT CAA TTG GCC CTT GAA CAG GAA AAG AAT GGC GCC ACT ACC ATT ATG 432
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144

433 AAT ACG CTG GCG GAG CGT CGC CCC GAT CTG GGT GCT TTG TTA ATT AAT 480
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160

481 GAT AAA GCA ATC AAT GAG GTG ATA CCG CAA TTG CAG TTG GTC AAT GAA 528
161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176

529 ATT CTG I^^AAA GCT ATT CAG AAG AAA CTG AOWFiG ACT GAT CTG GAA 576
177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192

577 GCG GTA AAC GCC AGA CTT TCC ACT ACC CGT TAC CCG AAT AAT CTG CCG 624
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208

625 TAT CAT TAT GGT CAT CAG CAG ATT CAG ACA GCT CAA TCG GTA TTG GGT 672
209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224

673 ACT ACG TTG CAA GAT ATC ACT TTG CCA CAG ACG CTG GAT CTG CCG CAA 720
225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240

721 AAC TTC TGG GCA ACA GCA AAA GGA AAA CTG AGC GAT ACG ACT GCC AGT 768
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256

769 GCT TTG ACC CGA CTG CAA ATC ATG GCG AGT CAG TTT TCG CCA GAG CAG 816
257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272

817 CAG AAA ATC ATT ACG GAG ACT GTC GGT CAG GAT TTC TAT CAG CTT AAC 864
273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288

865 TAT GGT GAC AGT TCG CTT ACT GTG AAT AGT TTC AGC GAC ATG ACC ATA 912
289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304

913 ATG ACT GAT CGA ACA AGT TTG ACT GTA CCC CAG GTA GAA CTG ATG TTG 960
305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gln Val Glu Leu Met Leu 320

961 TGT TCA ACT GTC GGA GGT TCT ACG GTT GTT AAG TCT GAT AAT GTG AGT 1008
321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336

1009 TCT GGT GAC ACG ACA GCG ACG CCA TTT GCG TAT GGC GCC CGC TTT ATT 1056
337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe Ile 352

1057 CAT GCC GGT AAG CCG GAG GCG ATT ACC CTG AGT CGC AGT GGT GCG GAG 1104
353 His Ala Gly Lys Pro Glu Ala Ile Thr Leu Ser Arg Ser Gly Ala Glu 368

1105 GCG CAT TTT GCT CTG ACG GTT AAC AAT CTG ACA GAT GAC AAG TTG GAC 1152
369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384

1153 CGT ATT AAC CGC ACA GTG CGC CTG CAA AAA TGG CTG AAT CTG CCT TAT 1200
385 Arg Ile Asn Arg Thr Val Arg Leu Gln Lys Trp Leu Asn Leu Pro Tyr 400

1201 GAG GAT ATT GAC CTG TTA GTG ACT TCT GCT ATG GAT GCG GAA ACA GGA 1248
401 Glu Asp Ile Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416

1249 AAT ACC GCG CTG TCG ATG AAC GAC AAT ACG CTG CGT ATG TTG GGA GTG 1296
417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432

1297 TTC AAA CAT TflJ^kc GCG AAG TAT GGT GTT AGC GCT CAA TTT GCT 1344

433 Phe Lys His Tyr Gin Ala Lys Tyr Gly Val Ser Ala Lys Gin Phe Ala 44B 

134 5 GGC TGG CTG CGC GTA GTG GCC CCG TTT GCC ATT ACA CCG GCA ACG CCG 
1392 

44 9 Gly Trp Leu Arg Val Val Ala Pro Phe Ala He Thr Pro Ala Thr Pro 464 

13 93 TTT TTA GAC CAA GTG TTT AAC TCC GTC GGC ACC TTT GAT ACA CCG TTT 
1440 

465 Phe Leu Asp Gin Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 4 80 

1441 GTG ATA GAT AAT CAG GAT TTT GTC TAT ACA TTG ACC ACC GGG GGC GAT 
1488 

481 Val He Asp Asn Gin Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 4 96 

14 8 9 GGG GCG CGT GTT AAG CAT ATC AGC ACG GCA CTG GGC CTC AAT CAT CGT 
1536 

497 Gly Ala Arg Val Lys His lie Ser Thr Ala Leu Gly Leu Asn His Arg 512 

1537 CAG TTC CTG TTA TTG GCG GAT AAT ATT GCC CGT CAA CAG GGG AAT GTC 
1584 

513 Gin Phe Leu Leu Leu Ala Asp Asn He Ala Arg Gin Gin Gly Asn Val 528 

15 8 5 ACG CAA AGC ACA CTC AAC TGT AAT CTG TTT GTG GTG TCA GCT TTC TAC 
1632 

529 Thr Gin Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544 

163 3 CGT CTG GCT AAT TTG GCG CGC ACA TTG GGG ATA AAT CCA GAG TCT TTC 
1680 

545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly He Asn Pro Glu Ser Phe 560 

16 Bl TGT GCC TTG GTT GAT CGA TTA GAT GCA GGT ACA GGC ATC GTC TGG CAG 
1728 

561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly He Val Trp Gin 576 

172 9 CAA TTG GCA GGG AAA CCC ACA ATC ACG GTA CCA CAA AAA GAT TCC CCG 
1776 

577 Gin Leu Ala Gly Lys Pro Thr He Thr Val Pro Gin Lys Asp Ser Pro 592 

1777 CTG GCG GCG GAT ATT CTG AGT TTG CTG CAA GCG CTA AGT GCG ATT GCT 
1824 

593 Leu Ala Ala Asp He Leu Ser Leu Leu Gin Ala Leu Ser Ala He Ala 608 

182 5 CAA TGG CAA CAA CAG CAC GAT TTA GAA TTT TCA GCA CTG CTT TTG CTG 
1872 

609 Gin Trp Gin Gin Gin His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624 

1873 TTG AGT GAC AAC CCT ATT TCT ACC TCG CAG GGC ACT GAC GAT CAA TTG 
1920 

625 Leu Ser Asp Asn Pro He Ser Thr Ser Gin Gly Thr Asp Asp Gin Leu 640 

1921 AAC TTT ATC CGT CAA GTG TGG CAG AAC CTA GGC AGT ACG TTT GTG GGT 
1968 

641 Asn Phe lie Arg Gin Val Trp Gin Asn Leu Gly Ser Thr Phe Val Gly 656 
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1969 GCA ACA TTG TCC CGC AGT GGG GCA CCA TTlTOTC GAT ACC AAC GGC 

2016 

657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672 

5 

2 017 CAC GCT ATT GAC TGG TTT GCT CTG CTC TCA GCA GGT AAT AGT CCG CTT 
2064 

673 His Ala lie Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688 

10 

2065 ATC GAT AAG GTT GGT CTG GTG ACT GAT GCT GGC ATA CAA AGT GTT ATA 
2112 

689 lie Asp Lys Val Gly Leu Val Thr Asp Ala Gly lie Gin Ser Val lie 704 

15 

2113 GCA ACG GTG GTC AAT ACA CAA AGC TTA TCT GAT GAA GAT AAG AAG CTG 
2160 

705 Ala Thr Val Val Asn Thr Gin Ser Leu Ser Asp Glu Asp Lys Lys Leu 720 

20 

2161 GCA ATC ACT ACT CTG ACT AAT ACG TTG AAT CAG GTA CAG AAA ACT CAA 
2208 

721 Ala He Thr Thr Leu Thr Asn Thr Leu Asn Gin Val Gin Lys Thr Gin 736 

25 

2 2 09 CAG GGC GTG GCC GTC AGT CTG TTG GCG CAG ACT CTG AAC GTG AGT CAG 
2256 

737 Gin Gly Val Ala Val Ser Leu Leu Ala Gin Thr Leu Asn Val Ser Gin 752 

30 

22 57 TCA CTG CCT GCG TTA TTG TTG CGC TGG AGT GGA CAA ACA ACC TAC CAG 
2304 

75 3 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gin Thr Thr Tyr Gin 7 68 

35 

2305 TGG TTG AGT GCG ACT TGG GCA TTG AAG GAT GCC GTT AAG ACT GCC GCC 2352

769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784

2353 GAT ATT CCC GCT GAC TAT CTG CGT CAA TTA CGT GAA GTG GTA CGC CGC 2400

785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800

2401 TCC TTG TTG ACC CAA CAA TTC ACG CTG AGT CCT GCA ATG GTG CAA ACC 2448

801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816

2449 TTG CTG GAC TAT CCA GCC TAT TTT GGC GCT TCC GCA GAA ACA GTG ACC 2496

817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832

2497 GAT ATC AGT TTG TGG ATG CTT TAT ACC CTG AGC TGT TAT AGC GAT TTA 2544

833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848

2545 TTG CTC CAA ATG GGT GAA GCT GGT GGT ACC GAA GAT GAT GTA CTG GCC 2592

849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864

2593 TAC TTA CGC ACA GCT AAT GCT ACC ACA CCG TTG AGC CAA TCT GAT GCT 2640

865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880

2641 GCA CAG ACG T^^A ACG CTA TTG GGT TGG GAG GTT GAG TTG CAA 2688

881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896

2689 GCC GCT TGG TCG GTA TTG GGC GGG ATT GCC AAA ACC ACA CCG CAA CTG 2736

897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912

2737 GAT GCG CTT CTG CGT TTG CAA CAG GCA CAG AAC CAA ACT GGT CTT GGC 2784

913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928

2785 GTT ACA CAG CAA CAG CAA GGC TAT CTC CTG AGT CGT GAC AGT GAT TAT 2832

929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944

2833 ACC CTT TGG CAA AGC ACC GGT CAG GCG CTG GTG GCT GGC GTA TCC CAT 2880

945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960

2881 GTC AAG GGC AGT AAC TGA GCATGGCAGA GCTCACTACC TGAGTGGATT TGATTT 2934

961 Val Lys Gly Ser Asn End 965

2935 TTCCGTATGG CCTAATGAGG CTATTTCTAA ACCGCCATTT AAGTAAGGCA GATAATTATG 2994
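
The row labels in the sequence above follow a simple coordinate rule: when the coding sequence starts at nucleotide 1, residue n is encoded by nucleotides 3n-2 through 3n. The short Python sketch below checks that rule against labels that appear in the rows above. The function name `codon_span` is an illustrative assumption and is not part of the listing.

```python
def codon_span(residue: int) -> tuple[int, int]:
    """Return (first, last) nucleotide positions of the codon for a 1-based
    residue index, assuming the reading frame starts at nucleotide 1."""
    if residue < 1:
        raise ValueError("residue index is 1-based")
    return 3 * residue - 2, 3 * residue


# Residue 769 opens the row whose nucleotide label is 2305.
print(codon_span(769))
# Residue 965 is the last residue; the stop codon follows its codon.
print(codon_span(965))
```

Under the same assumption, the stop codon occupies the three nucleotides immediately after the codon of the final residue.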



(2) INFORMATION FOR SEQ ID NO: 57

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 965 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 (TccA peptide)
Features From To Description

1 10 SEQ ID NO: 8

1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16

17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32

33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48

49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64

65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80

81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96

97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112

113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128

129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144

145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160

161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176

177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192

193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208

209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224

225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240

241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256

257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272

273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288

289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304

305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gln Val Glu Leu Met Leu 320

321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336

337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe Ile 352

353 His Ala Gly Lys Pro Glu Ala Ile Thr Leu Ser Arg Ser Gly Ala Glu 368

369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384

385 Arg Ile Asn Arg Thr Val Arg Leu Gln Lys Trp Leu Asn Leu Pro Tyr 400

401 Glu Asp Ile Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416

417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432

433 Phe Lys His Tyr Gln Ala Lys Tyr Gly Val Ser Ala Lys Gln Phe Ala 448

449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala Ile Thr Pro Ala Thr Pro 464

465 Phe Leu Asp Gln Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 480

481 Val Ile Asp Asn Gln Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496

497 Gly Ala Arg Val Lys His Ile Ser Thr Ala Leu Gly Leu Asn His Arg 512

513 Gln Phe Leu Leu Leu Ala Asp Asn Ile Ala Arg Gln Gln Gly Asn Val 528

529 Thr Gln Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544

545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ile Asn Pro Glu Ser Phe 560

561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576

577 Gln Leu Ala Gly Lys Pro Thr Ile Thr Val Pro Gln Lys Asp Ser Pro 592

593 Leu Ala Ala Asp Ile Leu Ser Leu Leu Gln Ala Leu Ser Ala Ile Ala 608

609 Gln Trp Gln Gln Gln His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624

625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640

641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656

657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672

673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688

689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704

705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720

721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736

737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752

753 Ser Leu Pro A 2U Leu Leu Arg Trp Ser Gly Gln [...] Thr Tyr Gln 768

769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784

785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800

801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816

817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832

833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848

849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864

865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880

881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896

897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912

913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928

929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944

945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960

961 Val Lys Gly Ser Asn 965

(2) INFORMATION FOR SEQ ID NO: 58

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4932 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 (tccB)





1 ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG 48

1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16

49 TTG GTG ACT GGC TAT ATG AAT TTT GTG GCG CCG ACG TTG AAA GGC GTC 96

17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32

97 AGT GGT CAG CCG GTG ACG GTG GAA GAT TTA TAC GAA TAT TTG CTG ATT 144

33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48

145 GAC CCG GAA GTG GCT GAT GAG GTT GAG ACG AGT CGG GTA GCA CAA GCG 192

49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64

193 ATT GCC AGC ATA CAG CAA TAT ATG ACT CGT CTG GTC AAC GGC TCT GAA 240

65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80

241 CCG GGG CGT CAG GCG ATG GAG CCT TCT ACA GCT AAC GAA TGG CGT GAT 288

81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96

289 AAT GAT AAC CAA TAT GCT ATC TGG GCT GCG GGG GCT GAG GTT CGA AAT 336

97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112

337 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 384

113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128



385 TAT TTC T^pfcAG CTG GAG ACG ACT TTA AAT CA(3^^T CGA CTC GAT CCG 432

129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144

433 GAT CGT GTG CAG GAT GCT GTT TTG GCG TAT CTC AAT GAG TTT GAG GCA 480

145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160

481 GTG AGT AAT CTA TAT GTG CTC AGT GGT TAT ATT AAT CAG GAT AAA TTT 528

161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176

529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 576

177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192

577 CGC TAC TAC TGG CGT CAG ATG GAT TTG AGT AAG AAC CGT CAA GAT CCG 624

193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208

625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672

209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224

673 ACT TTG CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 720

225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240

721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG 768

241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256

769 GCA GTA CAG AAG GAT GCT GAC GGT AAA AAC ATC GGT AAA ACC CAT GCC 816

257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272

817 TAC AAC ATA AAG TTT GGT TAT AAA CGT TAT GAT GAT ACT TGG ACA GCG 864

273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288

865 CCG AAT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA GAA 912

289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304

913 ACA CAG CGA TCC AGC CTG CTG ATT GAT GAA TCT AGC ACC ACA TTG CGC 960

305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320

961 CAA GTT AAT CTG TTG GCT ACC ACC GAT TTT AGT ATC GAT CCG ACG GAG 1008

321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336

1009 GAA ACG GAC AGT AAC CCG TAT GGC CGC CTA ATG TTG GGG GTG TTT GTC 1056

337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352

1057 CGT CAA TTT GAA GGT GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104

353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368

1105 TAT GGT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC AGG 1152

369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384

1153 CCG TTA AGT AAG AAC TTT TTG TTC AGT ACT TAC CGT GAT GAA ACG GAT 1200

385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400

1201 GGT CAA AAC AGC TTG CAA TTT GCG GTA TAC GAT AAA AAG TAT GTA ATT 1248

401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416

1249 ACT AAG GTT GTT ACA GGT GCA ACG GAA GAT CCC GAA AAT ACA GGA TGG 1296

417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432

1297 GTA AGT AAA GTT GAT GAC TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 1344

433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448

1345 TAT ATC GAT CAA GAT GGC CTG ACG CTT CAT ATA CAA ACC ACA ACT AAT 1392

449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464

1393 GGG GAT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 1440

465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480

1441 GAT TCT AAG TCT GGT TAT GGT TTC ACG TGG TCA GGA AAT GAA GGT TTT 1488

481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496

1489 TAT CTG GAT TAC CAT GAT GGA AAT TAT TAC ACC TTT CAT AAT GCA ATA 1536

497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512

1537 ATC AAC TAC TAT CCG TCT GGA TAT GGT GGT GGA TCT GTT CCT AAT GGA 1584

513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528

1585 ACG TGG GCG TTA GAG CAA AGG ATT AAT GAG GGA TGG GCT ATT GCT CCC 1632

529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544

1633 CTG CTT GAT ACT CTC CAT ACT GTT ACT GTG AAG GGC AGT TAT ATC GCT 1680

545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560

1681 TGG GAA GGG GAA ACA CCT ACC GGT TAT AAT CTG TAT ATT CCA GAT GGT 1728

561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576

1729 ACC GTG TTG CTA GAT TGG TTT GAT AAA ATA AAT TTT GCT ATT GGT CTT 1776

577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592

1777 AAT AAG CTT GAG TCT GTA TTT ACG TCG CCA GAT TGG CCA ACA CTA ACC 1824

593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608

1825 ACT ATC AAA AAT TTC AGT AAA ATC GCC GAT AAC CGC AAA TTC TAT CAG 1872

609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624

1873 GAA ATC AAT GCT GAG ACG GCG GAT GGA CGC AAC CTG TTT AAA CGT TAC 1920

625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640

1921 AGT ACT CAA ACT TTC GGA CTT ACC AGC GGT GCG ACT TAT TCT ACA ACT 1968

641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656

1969 TAT ACT TTG TCT GAG GCG GAT TTC TCC ACT GAT CCG GAC AAA AAC TAC 2016

657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672

2017 CTA CAG GTT TGT TTG AAT GTC GTG TGG GAT CAT TAT GAC CGC CCG TCA 2064

673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688

2065 GGG AAA AAA GGG GCT TAT TCT TGG GTC AGT AAG TGG TTT AAC GTC TAT 2112

689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704

2113 GTT GCG TTG CAA GAT AGC AAA GCT CCG GAT GCC ATT CCT CGA TTA GTT 2160

705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720

2161 TCC CGT TAC GAT AGT AAA CGT GGT CTG GTG CAA TAT CTG GAC TTC TGG 2208

721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736

2209 ACC TCA TCA TTA CCC GCG AAA ACC CGT CTT AAC ACC ACC TTT GTG CGT 2256

737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752

2257 ACT TTG ATT GAG AAG GCT AAT CTG GGG CTG GAT AGT TTG CTG GAT TAC 2304

753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768

2305 ACC TTG CAG GCA GAT CCT TCT CTG GAA GCA GAT TTA GTG ACT GAC GGC 2352

769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784

2353 AAA AGC GAA CCA ATG GAC TTT AAT GGT TCA AAC GGT CTC TAT TTC TGG 2400

785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800

2401 GAA TTG TTC TTT CAC CTG CCG TTT TTG GTT GCT ACA CGC TTT GCC AAC 2448

801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816

2449 GAA CAG CAA TTT TCG CCG GCA CAA AAG AGT TTG CAT TAC ATC TTT GAC 2496

817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832

2497 CCG GCG ATG AAA AAC AAG CCA CAC AAT GCC CCG GCT TAT TGG AAT GTA 2544

833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848

2545 CGT CCG TTG GAA GGA AAC AGC GAT TTG TCA Cfl^^AT TTG GAC GAT 2592

849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864

2593 TCT ATA GAC CCA GAT ACT CAA GCT TAT GCT CAT CCG GTG ATA TAC CAG 2640

865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880

2641 AAA GCG GTG TTT ATT GCC TAT GTC AGT AAC CTG ATT GCT CAG GGA GAT 2688

881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896

2689 ATG TGG TAT CGC CAA TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT GTC 2736

897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912

2737 TAT TAC AAT CTG GCC GCT GAA TTG CTA GGG CCT CGT CCG GAT GTA TCG 2784

913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928

2785 CTG AGT AGC ATT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG 2832

929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944

2833 CAA AAA GCG GTT TTA CGT GAT TTT GAG CAC CAG TTG GCT AAT AGT GAT 2880

945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960

2881 ACC GCT TTA CCC GCA TTG CCG GGC CGC AAT GTC AGC TAC TTG AAA CTG 2928

961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976

2929 GCA GAT AAT GGC TAC TTT AAT GAA CCG CTC AAT GTT CTG ATG TTG TCT 2976

977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992

2977 CAC TGG GAT ACG TTG GAT GCA CGG TTA TAC AAT CTG CGT CAT AAC CTG 3024

993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008

3025 ACC GTT GAT GGC AAG CCG CTT TCG CTG CCG CTG TAT GCT GCG CCT GTT 3072

1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024

3073 GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG 3120

1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040

3121 AAT GGC GTC AGT GGC GCC ATG TTG ACG GTG CCG CCA TAC CGT TTC AGC 3168

1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056

3169 GCT ATG TTG CCG CGA GCT TAC AGC GCC GTG GGT ACG TTG ACC AGT TTT 3216

1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072

3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA 3264

1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088

3265 GAA GAG TTG GCG CAA CAG CAA CTG TTG GAT ATG TCC AGC TAT GCC ATC 3312

1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104

3313 ACG TTG CAA CAA CAG GCG CTG GAT GGA TTG GCG GCA GAT CGT CTG GCG 3360

1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120

3361 CTG CTA GCT AGT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3408

1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136

3409 ACT CTG TAT CAG AAC AAC ATC TCC AGT GCG GAA CAA CTG GTG ATG GAC 3456

1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152

3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GGT GTA CAA 3504

1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168

3505 ACT GCC AGT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552

1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184

3553 GAT GGC GGC TCG CGC TAT GAA GGA GTA ACG GAA GCG ATT GCC ATC GGG 3600

1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200

3601 TTA ATG GCT GCC GGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3648

1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216

3649 ACC ACG GAG AAT TAC CGC CGC CGC CGT GAA GAG TGG CAA ATC CAA TAC 3696

1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232

3697 CAG CAG GCA CAG TCT GAG GTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744

1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248

3745 CTG GCA GTG GAG AAA GCA GCT CAA ACT TCC C^^AA CAG GCG AAG 3792

1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264

3793 GCA CAG CAG GTA CAA ATT CGG ACC ATG CTG ACT TAC TTA ACT ACT CGT 3840

1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280

3841 TTC ACC CAG GCG ACT CTG TAC CAG TGG CTG AGT GGT CAA TTA TCC GCG 3888

1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296

3889 TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TGC CTC TCC GCC 3936

1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312

3937 CAA GCT TGC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC 3984

1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328

3985 CAG ACC GGT ACC TGG AAC GAC CAT TAC CGT GGT TTG CAA GTG GGG GAG 4032

1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344

4033 ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT 4080

1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360

4081 CAC GAA CGC CGT CTT AAT GTG ATC CGT ACT GTG TCG CTC AAA AGC CTA 4128

1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376

4129 TTG GGT GAT GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC 4176

1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392

4177 TTT CCA TTA AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT 4224

1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408

4225 TTG CGC CAG ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG 4272

1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424

4273 CCG TAT CAA AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC AGT ATA 4320

1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440

4321 TTG TTA GCA GCA GAT ATC AAT GGT GTT AAA CGT CTC AAT GAT CCG ACA 4368

1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456

4369 GGT AAA GAG GGT GAT GCG ACG CAT ATT GTC ACC AAT CTG CGT GCC AGC 4416

1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472

4417 CAG CAG GTG GCG CTC TCT TCT GGC ATT AAT GAT GCC GGT AGC TTT GAG 4464

1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488

4465 TTG CGT TTG GAA GAT GAG CGC TAT CTA TCA TTT GAG GGG ACT GGA GCT 4512

1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504

4513 GTT TCC AAA TGG ACT CTT AAC TTC CCG CGT TCT GTG GAT GAG CAT ATT 4560

1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520

4561 GAC GAT AAG ACA TTG AAA GCG GAT GAG ATG CAG GCC GCA CTG TTG GCG 4608

1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536

4609 AAT ATG GAT GAT GTG CTG GTG CAG GTG CAT TAT ACC GCC TGC GAC GGC 4656

1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552

4657 GGC GCC AGT TTC GCA AAC CAG GTC AAG AAA ACA CTC TCT TAA CATTAACTTT 4708

1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser End 1565

4709 TAACTAATCC CTCCCACTCT GTTCGCCAGA GTGGGAGAAG GTTTGTCATA TCTAAAATCA 4768

4769 ATCTTGCGAT CTTTCTCCAT TTCATTGGAA GGGAAGCTGT AAAACAAATA AGGAATATGA 4828

4829 TATG 4932
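
Each nucleotide row of this listing is paired with its translation under the standard genetic code. As a minimal sketch of how that pairing can be checked, the Python below translates the first 48 nucleotides of the tccB sequence and prints the three-letter residues. The names `CODON_TABLE`, `THREE_LETTER` and `translate_row` are illustrative assumptions and do not come from the listing; the stop codon is rendered as "End" to match the listing's own notation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes; "*" (stop) is shown as "End".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate_row(dna_row: str) -> list[str]:
    """Translate a row of space-separated codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in dna_row.upper().split()]


# Nucleotides 1 to 48 of the tccB coding sequence, as printed in the listing.
row = "ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG"
print(" ".join(translate_row(row)))
# Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala
```

A row whose codons are garbled in the printed listing will raise a `KeyError` here, which makes the same function usable as a quick screen for damaged rows.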

(2) INFORMATION FOR SEQ ID NO: 59

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1565 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 (TccB peptide)

Features From To Description

1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16

17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32

33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48

49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64

65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80

81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96

97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112

113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128

129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144

145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160

161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176

177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192

193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208

209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224

225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240

241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256

257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272

273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288

289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304

305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320

321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336

337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352

353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368

369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384

385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400

401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416

417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432

433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448



449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464

465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480

481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496

497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512

513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528

529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544

545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560

561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576

577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592

593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608

609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624

625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640

641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656

657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672

673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688

689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704

705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720

721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736

737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752

753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768

769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784

785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800

801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816

817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832

833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848

849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864

865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880

881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896

897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912

913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928

929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944

945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960

961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976

977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992

993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008

1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024

1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040

1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056

1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072

1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088

1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104

1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120

1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136

1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152

1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168

1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184

1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200

1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216

1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232

1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248

1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264

1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280

1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296

1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312

1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328

1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344

1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360

1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376

1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392

1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408

1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424

1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440

1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456

1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472

1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488

1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504

1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520

1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536

1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552

1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser 1565



(2) INFORMATION FOR SEQ ID NO: 60 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3132 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 (tccC)

1 ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA GTC AGC 48
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16

49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CGT 96
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32

97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT 144
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48

145 GAT GCC CGT GGA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTG TAT GAT 192
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64

193 GCA AAG CAG GCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT 240
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80

241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT 288
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96

289 ACT GTT GCA TTG AAT GAT ATT GAA GGT CGT TCG GTA ATG ACA ATG AAT 336
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112

337 GCG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC 384
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128

385 GGT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT 432
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144

433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG AAA 480
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160



481 GAG TAT AAC CTC TCC GGT CTG TGT ATA CGC CAC TAC GAC ACA GCG GGA 528
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176

529 GTG ACC CGG TTG ATG AGT CAG TCA CTG GCG GGC GCC ATG CTA TCC CAA 576
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192

577 TCT CAC CAA TTG CTG GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC 624
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208

625 GAC GAA ACT GTC TGG CAG GGA ATG CTG GCA AGT GAG GTC TAT ACG ACA 672
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224

673 CAA AGT ACC ACT AAT GCC ATC GGG GCT TTA CTG ACC CAA ACC GAT GCG 720
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240

721 AAA GGC AAT ATT CAG CGT CTG GCT TAT GAC ATT GCC GGT CAG TTA AAA 768
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256

769 GGG AGT TGG TTG ACG GTG AAA GGC CAG AGT GAA CAG GTG ATT GTT AAG 816
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272

817 TCC CTG AGC TGG TCA GCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 864
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288

865 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 912
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304

913 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA GCC AGA 960
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320

961 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336

1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352
1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATG 1104
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368

1105 AGT GCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1152
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384

1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1200
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400

1201 AAT TAC CTT CGT ACC TAT ACT TAT GAC CGT GGC GGT AAT TTG GTT CAA 1248
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416

1249 ATC CGA CAC AGT TCA CCC GCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1296
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432

1297 ACC GTT TCA AGC CGC AGT AAC CGG GCG GTA TTG AGT ACA TTA ACG ACA 1344
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448

1345 GAT CCA ACC CGA GTG GAT GCG CTA TTT GAT TCC GGC GGT CAT CAG AAG 1392
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464

1393 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG 1440
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480

1441 CAA CGA GTC ACA CCG GTG AGC CGT GAA AAT AGC AGT GAC AGT GAA TGG 1488
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496

1489 TAT CGC TAT AGC AGT GAT GGC ATG CGG CTG CTA AAA GTG AGT GAA CAG 1536
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512

1537 CAG ACG GGC AAC AGT ACT CAA GTA CAA CGG GTG ACT TAT CTG CCG GGA 1584
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528

1585 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1632
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544

1633 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1680
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560

1681 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTG CGC 1728
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576
1729 TAC AGC TAC GAT AAT CTG CTT GGC TCC AGC CAG CTT GAA CTG GAT AGC 1776
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592

1777 GAA GGG CAG ATT CTC AGT CAG GAA GAG TAT TAT CCG TAT GGC GGT ACG 1824
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608

1825 GCG ATA TGG GCG GCG AGA AAT CAG ACA GAA GCC AGC TAC AAA TTT ATT 1872
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624

1873 CGT TAC TCC GGT AAA GAG CGG GAT GCC ACT GGA TTG TAT TAT TAC GGC 1920
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640

1921 TAC CGT TAT TAT CAA CCT TGG GTG GGT CGA TGG TTG AGT GCT GAT CCG 1968
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656

1969 GCG GGA ACC GTG GAT GGG CTG AAT TTG TAC CGA ATG GTG AGG AAT AAC 2016
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672

2017 CCC ATC ACA TTG ACT GAC CAT GAC GGA TTA GCA CCG TCT CCA AAT AGA 2064
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688

2065 AAT CGA AAT ACA TTT TGG TTT GCT TCA TTT TTG TTT CGT AAA CCT GAT 2112
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704

2113 GAG GGA ATG TCC GCG TCA ATG AGA CGG GGA CAA AAA ATT GGC AGA GCC 2160
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720

2161 ATT GCC GGC GGG ATT GCG ATT GGC GGT CTT GCG GCT ACC ATT GCC GCT 2208
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736

2209 ACG GCT GGC GCG GCT ATC CCC GTC ATT CTG GGG GTT GCG GCC GTA GGC 2256
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly 752

2257 GCG GGG ATT GGC GCG TTG ATG GGA TAT AAC GTC GGT AGC CTG CTG GAA 2304
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 768

2305 AAA GGC GGG GCA TTA CTT GCT CGA CTC GTA CAG GGG AAA TCG ACG TTA 2352
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu 784

2353 GTA CAG TCG GCG GCT GGC GCG GCT GCC GGA GCG AGT TCA GCC GCG GCT 2400
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 800
2401 TAT GGC GCA CGG GCA CAA GGT GTC GGT GTT GCA TCA GCC GCC GGG GCG 2448
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala 816

2449 GTA ACA GGG GCT GTG GGA TCA TGG ATA AAT AAT GCT GAT CGG GGG ATT 2496
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile 832

2497 GGC GGC GCT ATT GGG GCC GGG AGT GCG GTA GGC ACC ATT GAT ACT ATG 2544
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met 848

2545 TTA GGG ACT GCC TCT ACC CTT ACC CAT GAA GTC GGG GCA GCG GCG GGT 2592
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 864

2593 GGG GCG GCG GGT GGG ATG ATC ACC GGT ACG CAA GGG AGT ACT CGG GCA 2640
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala 880

2641 GGT ATC CAT GCC GGT ATT GGC ACC TAT TAT GGC TCC TGG ATT GGT TTT 2688
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe 896

2689 GGT TTA GAT GTC GCT AGT AAC CCC GCC GGA CAT TTA GCG AAT TAC GCA 2736
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912

2737 GTG GGT TAT GCC GCT GGT TTG GGT GCT GAA ATG GCT GTC AAC AGA ATA 2784
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile 928

2785 ATG GGT GGT GGA TTT TTG AGT AGG CTC TTA GGC CGG GTT GTC AGC CCA 2832
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944

2833 TAT GCC GCC GGT TTA GCC AGA CAA TTA GTA CAT TTC AGT GTC GCC AGA 2880
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960

2881 CCT GTC TTT GAG CCG ATA TTT AGT GTT CTC GGC GGG CTT GTC GGT GGT 2928
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976

2929 ATT GGA ACT GGC CTG CAC AGA GTG ATG GGA AGA GAG AGT TGG ATT TCC 2976
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser 992

2977 AGA GCG TTA AGT GCT GCC GGT AGT GGT ATA GAT CAT GTC GCT GGC ATG 3024
993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008

3025 ATT GGT AAT CAG ATC AGA GGC AGG GTC TTG ACC ACA ACC GGG ATC GCT 3072
1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024

3073 AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GCA CGA CGA GTT 3120
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040

3121 TTT TCT TTG TAA 3132
1041 Phe Ser Leu End 1043



(2) INFORMATION FOR SEQ ID NO: 61 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1043 amino acids 

(B) TYPE: amino acid 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 (TccC peptide) 



1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly 752
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 768
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu 784
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 800
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala 816
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile 832
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met 848
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 864
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala 880
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe 896
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile 928
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser 992
993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008
1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040
1041 Phe Ser Leu 1043























(2) INFORMATION FOR SEQ ID NO: 62: TcaA iv

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: TcaA iv

Asn Ile Gly Gly Asp
1 5

(2) INFORMATION FOR SEQ ID NO: 63: TcaAii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: TcaAii-syn

Cys Leu Arg Gly Asn Ser Pro Thr Asn Pro Asp Lys Asp Gly Ile
1 5 10 15

Phe Ala Gln Val Ala
20



(2) INFORMATION FOR SEQ ID NO: 64: TcaAiii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: TcaAiii-syn

Cys Tyr Thr Pro Asp Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe
1 5 10 15

Arg Ser Ala Asp Gly
20

(2) INFORMATION FOR SEQ ID NO: 65: TcaBi-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: TcaBi-syn

His Gly Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu
1 5 10 15

Ser Ile Asn Thr
19

(2) INFORMATION FOR SEQ ID NO: 66: TcaBii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: TcaBii-syn

Cys Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala Gly Gly Asp
1 5 10 15

Gly Thr Gly Ser Ser
20



(2) INFORMATION FOR SEQ ID NO: 67: TcaC-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: TcaC-syn

Cys Tyr Lys Ala Pro Gln Arg Gln Glu Asp Gly Asp Ser Asn Ala
1 5 10 15

Val Thr Tyr Asp Lys
20
(2) INFORMATION FOR SEQ ID NO: 68: TcbAii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: TcbAii-syn

Cys Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe
1 5 10 15

Ser Ser Lys Asp Asp
20

(2) INFORMATION FOR SEQ ID NO: 69: TcbAiii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: TcbAiii-syn

Cys Phe Asp Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala
1 5 10 15

Gly Glu Gln Arg Ala
20



(2) INFORMATION FOR SEQ ID NO: 70: TcdAii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: TcdAii-syn

Cys Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val
1 5 10 15

Tyr Gln Tyr Ser Gly Asn Thr
20

(2) INFORMATION FOR SEQ ID NO: 71: TcdAiii-syn

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: TcdAiii-syn

Val Ser Gln Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn Asn Leu
1 5 10 15

Ala Phe Gly Ala Gly
20

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 160 kDa - Hb

Met Gln Asp Ser Pro Glu Val Ala Ile Thr Thr Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 170 kDa - WIR

Met Gln Arg Ser Ser Glu Val Ser
1 5

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 180 kDa - H9

Met Gln Asp Ile Pro Glu Val Gln Leu Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 170 kDa - Hm

Met Gln Asp Ser Pro Glu Val Ser Val Thr Gln Asn
(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 74 kDa - H9

Ser Glu Ser Leu Phe Thr Gln Ser Leu Lys Glu Ala Arg Arg Asp
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 71 kDa - Hb

Met Asn Leu Ile Glu Ala Lys Leu Gln Glu Asn Arg Asp Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 170 kDa - H9

Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 109 kDa - Hm

Met Leu Asp Ile Met Glu Lys Gln Leu Asn Glu Ser Glu Arg Asp
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 170 kDa - WX-1

Met Gln Asp Ser Arg Glu Val Ser
1 5



(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 69 kDa - H9

Leu Arg Ser Ala Xxx Ser Ala Leu Thr Thr Leu Leu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 64 kDa - HP88

Leu Lys Leu Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 70 kDa - NC-1

Leu Lys Leu Ala Asp Asn Ser Tyr Phe Asn Glu Pro Leu Asn
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 60 kDa - WIR

Ser Lys Asp Glu Ser Lys Ala Asp Ser Gln Leu Val Tyr His Thr
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 58 kDa - NC-1

Met Lys Lys Arg Gly Leu Thr Thr Asn Ala Gly Ala Pro Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 60 kDa - WX-12

Met Leu Asn Pro Ile Val Arg Lys Phe Glu Tyr Gly Glu His Thr
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 60 kDa - Hm

Ala Glu Ile Tyr Asn Lys Asp Gly Asn Lys Leu Asp Leu Tyr Gly
1 5 10 15




(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 140 kDa - Hm

Asn Leu Ile Glu Ala Thr Leu Glu Gln Asn Leu Arg Asp Ala
1 5 10 15
We claim:

1. A composition, comprising an effective amount of a
Photorhabdus protein toxin that has functional activity
against an insect.

2. The composition of Claim 1, wherein the Photorhabdus 
toxin is produced by a purified culture of Photorhabdus, a 
transgenic plant, baculovirus, or heterologous microbial host. 

3. The composition of Claim 2, wherein the Photorhabdus
toxin is produced by a purified culture of Photorhabdus
luminescens.

4. The composition of Claim 2, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strain designated ATCC 55397.

5. The composition of Claim 2, wherein the toxin is
produced by a purified culture of Photorhabdus luminescens
strain designated W-14.

6. The composition of Claim 1, wherein the toxin is
produced by a purified culture of Photorhabdus strain
designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8,
WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1,
W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC#
43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P.
hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122,
HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2,
MP3, MP4, MP5, GL98, GL101, GL138, GL55, GL217, or GL257.

7. The composition of Claim 2, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strain designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7,
WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm,
HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC#
43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P.
zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB
Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A.
Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL55,
GL217, or GL257.
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8. The composition of Claim 1, wherein the toxin is 
represented by amino acid sequence is SEQ ID NO: 12. 

5 9. The composition of Claim 6, wherein the composition 

is a mixture of one or more toxins produced from purified 
cultures of Photorhabdus. 

10. The composition of Claim 1 or 6 , wherein the insect 
10 is of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera, 

Dictyoptera, Acarina or Homoptera. 

11. The composition of Claim 1 or 6, wherein the insect 
species is from order Coleoptera and is Southern Corn 

15 Rootworm, Western Corn Rootworm, Colorado Potato Beetle, 
Mealworm, Boll Weevil or Turf Grub. 

12. The composition of Claim 1 or 6 , wherein the insect 
species is from order Lepidoptera and is Beet Armyworm, Black 

20 Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European 
Corn Borer, Tobacco Hornworm, or Tobacco Budworm. 

13. The composition of Claim 1 or 6 , wherein the toxin 
is formulated as a sprayable insecticide. 

25 

14. The composition of Claim 1 or Claim 6, wherein the 
toxin is formulated as a bait matrix and delivered in an above 
ground or below ground bait station. 

30 15. A method of controlling an insect, comprising orally 

delivering to an insect an effective amount of a protein toxin 
that has functional activity against an insect, wherein the 
protein is produced by a purified bacterial culture of the 
genus Photorhabdus . 

35 

16. The method of Claim 15, wherein the bacterium is a 
purified culture of Photorhabdus luminescens. 

17. The method of Claim 15, wherein the toxin is 

4 0 produced from a purified culture of Photorhabdus luminescens 
strain designated ATCC 55397. 

18. The method of Claim 16, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strain designated W-14.

19. The method of Claim 15, wherein the toxin is
produced from a purified culture of Photorhabdus strains
designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8,
WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1,
W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC#
43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P.
hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122,
HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2,
MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.

20. The method of Claim 15, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strains designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7,
WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm,
HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC#
43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P.
zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB
Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A.
Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155,
GL217, or GL257.

21. The method of Claim 19, wherein a mixture of one or
more toxins is produced from a purified culture of
Photorhabdus and said toxins are orally delivered to an
insect.

22. The method of Claim 15, wherein the toxin is
produced by a prokaryotic host transformed with a gene
encoding the toxin.

23. The method of Claim 15, wherein the toxin is
produced by a eukaryotic host transformed with a gene encoding
the toxin.

24. The method of Claim 23, wherein the eukaryotic host 
is baculovirus. 

25. The method of Claim 15 or 19, wherein the insect is
of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera,
Dictyoptera, Acarina or Homoptera.

26. The method of Claim 15 or 19, wherein the insect
species is from order Coleoptera and is Southern Corn
Rootworm, Western Corn Rootworm, Colorado Potato Beetle,
Mealworm, Boll Weevil or Turf Grub.

27. The method of Claim 15 or 19, wherein the insect
species is from order Lepidoptera and is Beet Armyworm, Black
Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European
Corn Borer, Tobacco Hornworm, or Tobacco Budworm.



28. The method of Claim 15 or 19, wherein the toxin is
formulated as a sprayable insecticide.

29. The method of Claim 15 or Claim 19, wherein the
toxin is formulated as a bait matrix and delivered in an above
ground or below ground bait station.

30. A method of isolating a gene coding for a protein
subunit, comprising the steps of: constructing at least one
RNA or DNA oligonucleotide molecule that corresponds to at
least a part of a DNA coding region of an amino acid sequence
selected from a group consisting of SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID
NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36, SEQ ID NO:37, SEQ ID
NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42,
SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID
NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78,
SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID
NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87,
and SEQ ID NO:88, wherein the nucleotide molecule is used to
isolate genetic material from Photorhabdus or Photorhabdus
luminescens.

31. A method for expressing a protein produced by a
purified bacterial culture of the genus Photorhabdus in a
prokaryotic or eukaryotic host in an effective amount so that
the protein has functional activity against an insect, wherein
the method comprises: constructing a chimeric DNA construct
having 5' to 3' a promoter, a DNA sequence encoding a protein,
a transcription terminator, and then transferring the chimeric
DNA construct into the host.

32. The method of Claim 31, wherein the protein has
functional activity against insects selected from a group
consisting of Coleoptera, Lepidoptera, Diptera, Homoptera,
Hymenoptera, Dictyoptera, and Acarina.

33. The method of Claim 31, wherein the protein encoded
by the DNA sequence has an N-terminal amino acid sequence
selected from the group consisting of SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ
ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17,
SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36, SEQ ID NO:37,
SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID
NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73,
SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID
NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82,
SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID
NO:87, and SEQ ID NO:88.

34. The method of Claim 31, wherein the protein encoded
by the DNA sequence includes the amino acid sequence selected
from the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ
ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.

35. A chimeric DNA construct, adapted for expression in
a prokaryotic or eukaryotic host comprising, 5' to 3' a
transcriptional promoter active in the host; a DNA sequence
encoding a Photorhabdus protein that has functional activity
against an insect; and a transcriptional terminator.

36. A chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an N-terminal amino
acid sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ
ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10,
SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ
ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID
NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36,
SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID
NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72,
SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID
NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81,
SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID
NO:86, SEQ ID NO:87, and SEQ ID NO:88.

37. The chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an amino acid sequence
selected from the group consisting of SEQ ID NO:12, SEQ ID
NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID
NO:61.

38. The chimeric DNA construct of Claim 35, wherein the
DNA sequence encoding the Photorhabdus luminescens protein is
selected from the group comprising SEQ ID NO:11, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, and SEQ ID NO:60.

39. The chimeric DNA construct of Claim 35, wherein the 
host is baculovirus or a plant cell. 

40. An isolated and substantially purified preparation
comprising, a DNA molecule capable of encoding an effective
amount of a protein that is produced by a bacterium of the
genus Photorhabdus and that has functional activity against an
insect.



41. The preparation of Claim 40, wherein the bacterium 
is Photorhabdus luminescens. 



42. A purified preparation comprising, a protein
produced by Photorhabdus or Photorhabdus luminescens having an
N-terminal amino acid sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ
ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID
NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75,
SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID
NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84,
SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.



43. A purified protein preparation comprising, a protein that
has an N-terminal amino acid sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ
ID NO:9, and SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41,
SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID
NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77,
SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID
NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86,
SEQ ID NO:87, and SEQ ID NO:88.

44. A purified protein preparation comprising, a protein
selected from the group of SEQ ID NO:12, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.



45. A purified DNA preparation comprising, a DNA
sequence selected from the group consisting of SEQ ID NO:11,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52,
SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58 and SEQ ID NO:60,
wherein the DNA sequence is isolated from its native host.

46. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit
having an approximate molecular weight between 18 kDa to about
230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to
160 kDa; about 80 kDa to about 100 kDa; or about 18 kDa to
about 80 kDa.

47. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit
having an approximate molecular weight of about 280 kDa.

48. A substantially pure microorganism culture 
comprising, ATCC 55397. 

49. The culture of Claim 48, wherein the culture is a 
derivative of ATCC 55397 that produces a protein toxin that 
has functional activity against an insect. 

50. A transgenic plant comprising in its genome, a
chimeric artificial gene construction imbuing the plant with
an ability to express an effective amount of a Photorhabdus
protein that has functional activity against an insect.

51. The transgenic plant of Claim 50, wherein the plant
is transformed using acceleration of genetic material coated
onto microparticles directly into cells, Agrobacteria,
whiskers, or electroporation techniques.

52. The transgenic plant of Claim 50, wherein the
selectable marker is selected from the group consisting of
kanamycin, neomycin, glyphosate, hygromycin, methotrexate,
phosphinothricin (bialaphos), chlorosulfuron, bromoxynil,
dalapon and the like.

53. The transgenic plant of Claim 50, wherein the
promoter is selected from the group consisting of octopine
synthase, nopaline synthase, mannopine synthase, 35S, 19S,
35T, ribulose-1,6-bisphosphate (RUBP) carboxylase small
subunit (ssu), beta-conglycinin, phaseolin, alcohol
dehydrogenase (ADH), heat-shock, ubiquitin, zein, oleosin,
napin, or acyl carrier protein (ACP).

54. The transgenic plant of Claim 50, wherein
embryogenic tissue, callus tissue type I or II, hypocotyl,
meristem, or plant tissue during dedifferentiation is used in
preparing the transgenic plant.

55. The transgenic plant of Claim 50, wherein the
chimeric gene is a DNA sequence which encodes a Photorhabdus
protein that has functional activity against an insect and at
least one codon of the gene has been modified so that the
codon is a plant preferred codon.

56. A method of controlling an insect comprising orally
delivering to an insect an effective amount of a protein
toxin, wherein the protein is produced by a transgenic plant,
on which said insect feeds.

57. A composition of matter, comprising a purified DNA
sequence from a purified bacterial culture from the genus
Photorhabdus.



58. A substantially pure microorganism culture
comprising, H9.

59. A substantially pure microorganism culture
comprising, Hb.

60. A substantially pure microorganism culture
comprising, Hm.

61. A substantially pure microorganism culture
comprising, HP88.

62. A substantially pure microorganism culture
comprising, NC-1.

63. A substantially pure microorganism culture
comprising, W30.

64. A substantially pure microorganism culture
comprising, WIR.

65. A substantially pure microorganism culture
comprising, B2.

66. A substantially pure microorganism culture
comprising, P. zealandrica.

67. A substantially pure microorganism culture
comprising, P. hepialus.

68. A substantially pure microorganism culture
comprising, HB-Arg.

69. A substantially pure microorganism culture
comprising, HB Oswego.

70. A substantially pure microorganism culture
comprising, HB Lewiston.

71. A substantially pure microorganism culture
comprising, K-122.

72. A substantially pure microorganism culture
comprising, HMGD.

73. A substantially pure microorganism culture
comprising, Indicus.

74. A substantially pure microorganism culture
comprising, GD.

75. A substantially pure microorganism culture
comprising, PWH-5.

76. A substantially pure microorganism culture
comprising, Megidis.

77. A substantially pure microorganism culture
comprising, HF-85.

78. A substantially pure microorganism culture
comprising, A. Cows.

79. A substantially pure microorganism culture
comprising, MP1.

80. A substantially pure microorganism culture
comprising, MP2.

81. A substantially pure microorganism culture
comprising, MP3.

82. A substantially pure microorganism culture
comprising, MP4.

83. A substantially pure microorganism culture
comprising, MP5.

84. A substantially pure microorganism culture
comprising, GL98.

85. A substantially pure microorganism culture
comprising, GL155.

86. A substantially pure microorganism culture
comprising, GL101.

87. A substantially pure microorganism culture
comprising, GL138.

88. A substantially pure microorganism culture
comprising, GL217.

89. A substantially pure microorganism culture
comprising, GL257.

90. A method of making an antibody against a protein
fragment that is part of a protein having functional activity,
where the protein is produced by bacteria of the
Enterobacteriaceae family, wherein the method comprises:

a) isolating a fragment of the protein, where the
protein fragment is at least six amino acids;

b) immunizing a mammalian species with the protein
fragment; and

c) harvesting serum containing antibody or antibody from
the spleen of the mammalian species, where the antibody
harvested is antibody to the protein fragment having
functional activity.

91. The method of Claim 1, wherein the protein fragment
is selected from the group consisting of SEQ ID NO:63, SEQ ID
NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68,
SEQ ID NO:69, SEQ ID NO:70, and SEQ ID NO:71.

92. The method of Claim 90, wherein the bacteria is from
the genus Photorhabdus.

93. The method of Claim 90, wherein the bacteria is from
the genus Photorhabdus luminescens.

94. A method of selecting a DNA fragment which encodes a
portion of a protein that has functional activity, where the
protein is produced from a bacteria of the Enterobacteriaceae
family, wherein the method comprises:

a) isolating a fragment of the DNA sequence having at
least 30 nucleotides;

b) tagging the DNA fragment with a radioactive or
chemical agent;

c) hybridizing the DNA fragment to a DNA library, where
the DNA library is an Enterobacteriaceae cDNA or
Enterobacteriaceae genomic library; and

d) selecting the fragment that is hybridized to the DNA
in the library that encodes for the protein that has
functional activity.

95. The method of Claim 94, wherein the bacteria is from
the genus Photorhabdus.

96. The method of Claim 95, wherein the bacteria is from
the genus Photorhabdus luminescens.

97. A method of selecting a DNA fragment which encodes a
portion of a protein that has functional activity, where the
protein is produced from a bacteria of the Enterobacteriaceae
family, wherein the method comprises:

a) isolating at least two primers, where a primer is a
fragment of DNA having at least twelve nucleotides;

b) using the primers from step a), amplifying a DNA
fragment from Enterobacteriaceae by using primers with
polymerase chain reaction technology and purifying the DNA
fragment;

c) tagging the purified DNA fragment with a radioactive
or chemical agent;

d) hybridizing the purified DNA fragment to a DNA
library, where the DNA library is an Enterobacteriaceae cDNA or
Enterobacteriaceae genomic library; and

e) selecting a DNA fragment that is equal or larger in
size to the purified DNA fragment from the library, where the
selected DNA fragment or portion thereof encodes for a protein
that has functional activity.

98. The method of Claim 97, wherein the bacteria is from
the genus Photorhabdus.

99. The method of Claim 98, wherein the bacteria is from 
the genus Photorhabdus luminescens. 

FIG. 3 Physical Map of DNA fragments of tcb locus.
Estimated distance between fragments given in nucleotides.